2D Materials
PAPER • OPEN ACCESS
Recent progress of the Computational 2D
Materials Database (C2DB)
To cite this article: Morten Niklas Gjerding
et al
2021
2D Mater.
8 044002
View the article online for updates and enhancements.
You may also like
High throughput computational screening
for 2D ferromagnetic materials: the critical
role of anisotropy and local correlations
Daniele Torelli, Kristian S Thygesen and
Thomas Olsen
-
Exploring key aspects of art
documentation and examination in
curation technology-using Cheng-Shiu
university conservation center’s project as
a case study
L C Lin and I C Li
-
Machine learning enabled discovery of
application dependent design principles for
two-dimensional materials
Victor Venturi, Holden L Parks, Zeeshan
Ahmad et al.
-
This content was downloaded from IP address 149.125.159.219 on 20/04/2023 at 19:03
2D Mater. 8 (2021) 044002 https://doi.org/10.1088/2053-1583/ac1059
OPEN ACCESS
REC EIVED
18 January 2021
REV ISED
31 May 2021
AC CE PTED FOR P UBLICATION
30 June 2021
PUBLI SHED
15 July 2021
Original Content from
this work may be used
under the terms of the
Creative Commons
Attribution 4.0 licence.
Any further distribution
of this work must
maintain attribution to
the author(s) and the title
of the work, journal
citation and DOI.
PAPER
Recent progress of the Computational 2D Materials Database
(C2DB)
Morten Niklas Gjerding
1
, Alireza Taghizadeh
1,2
, Asbjørn Rasmussen
1
, Sajid Ali
1
,
Fabian Bertoldo
1
, Thorsten Deilmann
3
, Nikolaj Rørbæk Knøsgaard
1
, Mads Kruse
1
,
Ask Hjorth Larsen
1
, Simone Manti
1
, Thomas Garm Pedersen
2
, Urko Petralanda
1
,
Thorbjørn Skovhus
1
, Mark Kamper Svendsen
1
, Jens Jørgen Mortensen
1
, Thomas Olsen
1
and Kristian Sommer Thygesen
1,
1
Computational Atomic-scale Materials Design (CAMD), Department of Physics, Technical University of Denmark, 2800 Kgs. Lyngby,
Denmark
2
Department of Materials and Production, Aalborg University, 9220 Aalborg Øst, Denmark
3
Institut für Festkörpertheorie, Westfälische Wilhelms-Universität Münster, 48149 Münster, Germany
Author to whom any correspondence should be addressed.
E-mail: thygesen@fysik.dtu.dk
Keywords: 2D materials, high-throughput, ab-initio, database, density functional theory
Abstract
The Computational 2D Materials Database (C2DB) is a highly curated open database organising a
wealth of computed properties for more than 4000 atomically thin two-dimensional (2D)
materials. Here we report on new materials and properties that were added to the database since its
first release in 2018. The set of new materials comprise several hundred monolayers exfoliated from
experimentally known layered bulk materials, (homo)bilayers in various stacking configurations,
native point defects in semiconducting monolayers, and chalcogen/halogen Janus monolayers. The
new properties include exfoliation energies, Bader charges, spontaneous polarisations, Born
charges, infrared polarisabilities, piezoelectric tensors, band topology invariants, exchange
couplings, Raman spectra and second harmonic generation spectra. We also describe refinements
of the employed material classification schemes, upgrades of the computational methodologies
used for property evaluations, as well as significant enhancements of the data documentation and
provenance. Finally, we explore the performance of Gaussian process-based regression for efficient
prediction of mechanical and electronic materials properties. The combination of open access,
detailed documentation, and extremely rich materials property data sets make the C2DB a unique
resource that will advance the science of atomically thin materials.
1. Introduction
The discovery of new materials, or new properties
of known materials, to meet a specific industrial
or scientific requirement, is an exciting intellectual
challenge of the utmost importance for our envir-
onment and economy. For example, the successful
transition to a society based on sustainable energy
sources and the realisation of quantum technologies
(e.g. quantum computers and quantum communic-
ation) depend critically on new materials with novel
functionalities. First-principles quantum mechanical
calculations, e.g. based on density functional the-
ory (DFT) [1], can predict the properties of mater-
ials with high accuracy even before they are made
in the lab. They provide insight into mechanisms
at the most fundamental (atomic and electronic)
level and can pinpoint and calculate key properties
that determine the performance of the material at
the macroscopic level. Powered by high-performance
computers, atomistic quantum calculations in com-
bination with data science approaches, have the
potential to revolutionise the way we discover and
develop new materials.
Atomically thin, two-dimensional (2D) crystals
represent a fascinating class of materials with excit-
ing perspectives for both fundamental science and
technology [25]. The family of 2D materials has
been growing steadily over the past decade and counts
about a hundred materials that have been realised
© 2021 The Author(s). Published by IOP Publishing Ltd
2D Mater. 8 (2021) 044002 M N Gjerding et al
in single-layer or few-layer form [610]. While some
of these materials, including graphene, hexagonal
boron nitride (hBN), and transition metal dichal-
cogenides (TMDCs), have been extensively studied,
the majority have only been scarcely characterised
and remain poorly understood. Computational stud-
ies indicate that around 1000 already known layered
crystals have sufficiently weak interlayer (IL) bond-
ing to allow the individual layers to be mechanic-
ally exfoliated [11, 12]. Supposedly, even more 2D
materials could be realised beyond this set of already
known crystals. Adding to this the possibility of stack-
ing individual 2D layers (of the same or different
kinds) into ultrathin van der Waals (vdW) crystals
[13], and tuning the properties of such structures
by varying the relative twist angle between adjacent
layers [14, 15] or intercalating atoms into the vdW
gap [16, 17], it is clear that the prospects of tailor
made 2D materials are simply immense. To support
experimental efforts and navigate the vast 2D mater-
ials space, first-principles calculations play a pivotal
role. In particular, FAIR
5
[18] databases populated by
high-throughput calculations can provide a conveni-
ent overview of known materials and point to new
promising materials with desired (predicted) proper-
ties. Such databases are also a fundamental require-
ment for the successful introduction and deployment
of artificial intelligence in materials science.
Many of the unique properties exhibited by 2D
materials have their origin in quantum confinement
and reduced dielectric screening. These effects tend
to enhance many-body interactions and lead to pro-
foundly new phenomena such as strongly bound
excitons [1921] with nonhydrogenic Rydberg series
[2224], phonons and plasmons with anomalous dis-
persion relations [25, 26], large dielectric band struc-
ture renormalisations [27, 28], unconventional Mott
insulating and superconducting phases [14, 15], and
high-temperature exciton condensates [29]. Recently,
it has become clear that long range magnetic order
can persist [30, 31] and (in-plane) ferroelectricity
even be enhanced [32], in the single layer limit.
In addition, first-principles studies of 2D crystals
have revealed rich and abundant topological phases
[33, 34]. The peculiar physics ruling the world of 2D
materials entails that many of the conventional the-
ories and concepts developed for bulk crystals break
down or require special treatments when applied to
2D materials [26, 35, 36]. This means that com-
putational studies must be performed with extra
care, which in turn calls for well-organised and well-
documented 2D property data sets that can form
the basis for the development, benchmarking, and
consolidation of physical theories and numerical
implementations.
5
FAIR data are data which meet principles of findability, accessib-
ility, interoperability, and reusability.
The Computational 2D Materials Database
(C2DB) [6, 37] is a highly curated and fully open
database containing elementary physical properties
of around 4000 2D monolayer crystals. The data
has been generated by automatic high-throughput
calculations at the level of DFT and many-body
perturbation theory as implemented in the GPAW
[38, 39] electronic structure code. The computa-
tional workflow is constructed using the atomic sim-
ulation recipes (ASR) [40]—a recently developed
Python framework for high-throughput materials
modelling building on the atomic simulation envir-
onment (ASE) [41]—and managed/executed using
the MyQueue task scheduler [42].
The C2DB differentiates itself from existing
computational databases of bulk [4345] and low-
dimensional [11, 12, 4650] materials, by the large
number of physical properties available, see table 1.
The use of beyond-DFT theories for excited state
properties (GW band structures and Bethe–Salpeter
equation (BSE) absorption for selected materials) and
Berry-phase techniques for band topology and polar-
isation quantities (spontaneous polarisation, Born
charges, piezoelectric tensors), are other unique fea-
tures of the database.
The C2DB can be downloaded in its entirety or
browsed and searched online. As a new feature, all
data entries presented on the website are accom-
panied by a clickable help icon that presents a sci-
entific documentation (‘what does this piece of data
describe?’) and technical documentation (‘how was
this piece of data computed?’). This development
enhances the usability of the database and improves
the reproducibility and provenance of the data con-
tained in C2DB. As another novelty it is possible to
download all property data pertaining to a specific
material or a specific type of property, e.g. the band
gap, for all materials thus significantly improving data
accessibility.
In this paper, we report on the significant C2DB
developments that have taken place during the
past two years. These developments can be roughly
divided into four categories: (1) General updates
of the workflow used to select, classify, and stabil-
ity assess the materials. (2) Computational improve-
ments for properties already described in the 2018
paper. (3) New properties. (4) New materials. The
developments, described in four separate sections,
cover both original work and review of previously
published work. In addition, we have included some
outlook discussions of ongoing work. In the last
section we illustrate an application of statistical learn-
ing to predict properties directly from the atomic
structure.
2. Selection, classification, and stability
Figure 1 illustrates the workflow behind the C2DB. In
this section we describe the first part of the workflow
2
2D Mater. 8 (2021) 044002 M N Gjerding et al
Table 1. Properties calculated by the C2DB monolayer workflow. The computational method and the criteria used to decide whether the
property should be evaluation for a given material is also shown. A
indicates that spin–orbit coupling (SOC) is included. All
calculations are performed with the GPAW code using a plane wave basis except for the Raman calculations, which employ a double-zeta
polarised basis of numerical atomic orbitals [51].
Property Method Criteria Count
Bader charges PBE None 3809
Energy above convex hull PBE None 4044
Heat of formation PBE None 4044
Orbital projected band structure PBE None 2487
Out-of-plane dipole PBE None 4044
Phonons (Γ and BZ corners) PBE None 3865
Projected density of states PBE None 3332
Stiffness tensor PBE None 3968
Exchange couplings PBE Magnetic 538
Infrared polarisability PBE E
PBE
gap
> 0 784
Second harmonic generation PBE E
PBE
gap
> 0, non-magnetic,
non-centrosymmetric
375
Electronic band structure PBE PBE
None 3496
Magnetic anisotropies PBE
Magnetic 823
Deformation potentials PBE
E
PBE
gap
> 0 830
Effective masses PBE
E
PBE
gap
> 0 1272
Fermi surface PBE
E
PBE
gap
= 0 2505
Plasma frequency PBE
E
PBE
gap
= 0 3144
Work function PBE
E
PBE
gap
= 0 4044
Optical polarisability RPA@PBE None 3127
Electronic band structure HSE06@PBE
None 3155
Electronic band structure G
0
W
0
@PBE
E
PBE
gap
> 0, N
atoms
< 5 357
Born charges PBE, Berry phase E
PBE
gap
> 0 639
Raman spectrum PBE, LCAO basis set Non-magnetic, dyn. stable 708
Piezoelectric tensor PBE, Berry phase E
PBE
gap
, non-centrosym. 353
Optical absorbance BSE@G
0
W
0
E
PBE
gap
> 0, N
atoms
< 5 378
Spontaneous polarisation PBE, Berry phase E
PBE
gap
> 0, nearly centrosym.
polar space group
151
Topological invariants PBE
, Berry phase 0 < E
PBE
gap
< 0.3 eV 242
Figure 1. The workflow behind the C2DB. After the structural relaxation, the dimensionality of the material is checked and it is
verified that the material is not already present in the database. Next, the material is classified according to its chemical
composition, crystal structure, and magnetic state. Finally, the thermodynamic and dynamic stabilities are assessed from the
energy above the convex hull and the sign of the minimum eigenvalues of the dynamical matrix and stiffness tensor. Unstable
materials are stored in the database; stable materials are subject to the property workflow. The C2DB monolayer database is
interlinked with databases containing structures and properties of multilayer stacks and point defects in monolayers from the
C2DB.
3
2D Mater. 8 (2021) 044002 M N Gjerding et al
until the property calculations (red box), focusing
on aspects related to selection criteria, classification,
and stability assessment, that have been changed or
updated since the 2018 paper.
2.1. Structure relaxation
Given a prospective 2D material, the first step is to
carry out a structure optimisation. This calculation is
performed with spin polarisation and with the sym-
metries of the original structure enforced. The latter is
done to keep the highest level of control over the res-
ulting structure by avoiding ‘uncontrolled’ symmetry
breaking distortions. The prize to pay is a higher risk
of generating dynamically unstable structures.
2.2. Selection: dimensionality analysis
A dimensionality analysis [52] is performed to
identify and filter out materials that have disin-
tegrated into non-2D structures during relaxation.
Covalently bonded clusters are identified through an
analysis of the connectivity of the structures where
two atoms are considered to belong to the same
cluster if their distance is less than some scaling of
the sum of their covalent radii, i.e. d < k(r
cov
i
+ r
cov
j
),
where i and j are atomic indices. A scaling factor
of k = 1.35 was determined empirically. Only struc-
tures that consist of a single 2D cluster after relaxation
are further processed. Figure 2 shows three examples
(graphene, Ge
2
Se
2
, and Pb
2
O
6
) of structures and
their cluster dimensionalities before and after relax-
ation. All structures initially consist of a single 2D
cluster, but upon relaxation Ge
2
Se
2
and Pb
2
O
6
disin-
tegrate into two 2D clusters as well as one 2D and two
0D clusters, respectively. On the other hand, the relax-
ation of graphene decreases the in-plane lattice con-
stant but does not affect the dimensionality. Accord-
ing to the criterion defined above only graphene will
enter the database.
2.3. Selection: ranking similar structures
Maintaining a high-throughput database inevitably
requires a strategy for comparing similar structures
and ranking them according to their relevance. In par-
ticular, this is necessary in order to identify differ-
ent representatives of the same material e.g. result-
ing from independent relaxations, and thereby avoid
duplicate entries and redundant computations. The
C2DB strategy to this end involves a combination of
structure clustering and Pareto analysis.
First, a single-linkage clustering algorithm is used
to group materials with identical reduced chem-
ical formula and ‘similar’ atomic configurations. To
quantify configuration similarity a slightly modi-
fied version of PyMatGens [53] distance metric is
employed where the cell volume normalisation is
removed to make it applicable to 2D materials sur-
rounded by vacuum. Roughly speaking, the metric
measures the maximum distance an atom must be
moved (in units of Å) in order to match the two
Figure 2. Three example structures from C2DB (top:
graphene, middle: Ge
2
Se
2
, bottom: Pb
2
O
6
) with their
respective cluster dimensionalities cluster before (left) and
after (right) relaxation. The number N
xD
denotes the
number of clusters of dimensionality x. Note that the
number of atoms of the structures depicted in the left and
right columns can differ because the relaxation can change
the lattice constants.
atomic configurations. Two atomic configurations
belong to the same cluster if their distance is below
an empirically determined threshold of 0.3 Å.
At this point, the simplest strategy would be to
remove all but the most stable compound within a
cluster. However, this procedure would remove many
high symmetry crystals for which a more stable dis-
torted version exists. For example, the well known
T-phase of MoS
2
would be removed in favour of
the more stable T
-phase. This is undesired as high-
symmetry structures, even if dynamically unstable at
T = 0, may provide useful information and might in
fact become stabilised at higher temperatures [54].
Therefore, the general strategy adopted for the C2DB,
4
2D Mater. 8 (2021) 044002 M N Gjerding et al
Figure 3. Illustration of the Pareto analysis used to filter out duplicates or irrelevant structures from the C2DB. All points
represent materials with the same reduced chemical formula (in this case ReS
2
) that belong to the same cluster defined by the
structure metric. Only structures lying on the (N, H)-Pareto front are retained (black circles) while other materials are excluded
(red circles). The philosophy behind the algorithm is to keep less stable materials if they contain fewer atoms per unit cell than
more stable materials and thus represent structures of higher symmetry.
is to keep a material that is less stable than another
material of the same cluster if it has fewer atoms in
its primitive unit cell (and thus typically higher sym-
metry). Precisely, materials within a given cluster are
kept only if they represent a defining point of the (N,
H)-Pareto front, where N is the number of atoms
in the unit cell and H is the heat of formation. A
graphical illustration of the Pareto analysis is shown
in figure 3 for the case of ReS
2
.
2.4. Classification: crystal structure
The original C2DB employed a crystal prototype clas-
sification scheme where specific materials were pro-
moted to prototypes and used to label groups of
materials with the same or very similar crystal struc-
ture. This approach was found to be difficult to
maintain (as well as being non-transparent). Instead,
materials are now classified according to their crys-
tal type defined by the reduced stoichiometry, space
group number, and the alphabetically sorted labels of
the occupied Wyckoff positions. As an example, MoS
2
in the H-phase has the crystal type: AB2-187-bi.
2.5. Classification: magnetic state
In the new version of the C2DB, materials are classi-
fied according to their magnetic state as either non-
magnetic or magnetic. A material is considered mag-
netic if any atom has a local magnetic moment greater
than 0.1 µ
B
.
In the original C2DB, the magnetic category was
further subdivided into ferromagnetic (FM) and anti-
ferromagnetic (AFM). But since the simplest anti-
ferromagnetically ordered state typically does not
represent the true ground state, all material entries
with an AFM state have been removed from the
C2DB and replaced by the material in its FM state.
Although the latter is less stable, it represents a
more well defined state of the material. Crucially, the
nearest neighbour exchange couplings for all mag-
netic materials have been included in the C2DB (see
section 5.8). This enables a more detailed and realistic
description of the magnetic order via the Heisenberg
model. In particular, the FM state of a material is not
expected to represent the true magnetic ground if the
exchange coupling J < 0.
2.6. Stability: thermodynamic
The heat of formation, H, of a compound is defined
as its energy per atom relative to its constituent ele-
ments in their standard states [55]. The thermody-
namic stability of a compound is evaluated in terms of
its energy above the convex hull, H
hull
, which gives
the energy of the material relative to other compet-
ing phases of the same chemical composition, includ-
ing mixed phases [6], see figure 4 for an example.
Clearly, H
hull
depends on the pool of reference
phases, which in turn defines the convex hull. The
original C2DB employed a pool of reference phases
comprised by 2807 elemental and binary bulk crys-
tals from the convex hull of the Open Quantum
Materials Database (OQMD) [55]. In the new ver-
sion, this set has been extended by approximately
6783 ternary bulk compounds from the convex hull of
OQMD, making a total of 9590 stable bulk reference
compounds.
As a simple indicator for the thermodynamic
stability of a material, the C2DB employs three
labels (low, medium, high) as defined in table 2.
These indicators are unchanged from the original
version of the C2DB. In particular, the criterion
H
hull
< 0.2 eV atom
1
, defining the most stable
category, was established based on an extensive
analysis of 55 experimentally realised monolayer
crystals [6].
5
2D Mater. 8 (2021) 044002 M N Gjerding et al
Figure 4. Convex hull diagram for (Bi,I,Te)-compounds.
Green (red) colouring indicate materials that have a convex
hull energy of less than (greater than) 5 meV. The
monolayers BiI
3
, Bi
2
Te
3
and BiITe lie on the convex hull.
The monolayers are degenerate with their layered bulk
parent because the vdW interactions are not captured by
the PBE xc-functional.
Table 2. Thermodynamic stability indicator assigned to all
materials in the C2DB. H and H
hull
denote the heat of
formation and energy above the convex hull, respectively.
Thermodynamic stability
indicator Criterion (eV atom
1
)
Low H > 0.2
Medium H < 0.2 and H
hull
> 0.2
High H < 0.2 and H
hull
< 0.2
It should be emphasised that the energies of
both monolayers and bulk reference crystals are
calculated with the Perdew-Burke-Ernzerhof (PBE)
xc-functional [56]. This implies that some inac-
curacies must be expected, in particular for mater-
ials with strongly localised d-electrons, e.g. certain
transition metal oxides, and materials for which
dispersive interactions are important, e.g. layered
van der Waals crystals. The latter implies that the
energy of a monolayer and its layered bulk parent (if
such exists in the pool of references) will have the
same energy. For further details and discussions see
reference [6].
2.7. Stability: dynamical
Dynamically stable materials are situated at a local
minimum of the potential energy surface and are thus
stable to small structural perturbations. Structures
resulting from DFT relaxations can end up in saddle
point configurations because of imposed symmetry
constraints or an insufficient number of atoms in the
unit cell.
In C2DB, the dynamical stability is assessed from
the signs of the minimum eigenvalues of (1) the
stiffness tensor (see section 3.1) and (2) the Γ-point
Hessian matrix for a supercell containing 2 ×2 repe-
titions of the unit cell (the structure is not relaxed in
the 2 ×2 supercell). If one of these minimal eigen-
values is negative the material is classified as dynam-
ically unstable. This indicates that the energy can be
reduced by displacing an atom and/or deforming the
unit cell, respectively. The use of two categories for
dynamical stability, i.e. stable/unstable, differs from
the original version of the C2DB where an interme-
diate category was used for materials with negative
but numerically small minimal eigenvalue of either
the Hessian or stiffness tensors.
3. Improved property methodology
The new version of the C2DB has been generated
using a significantly extended and improved work-
flow for property evaluations. This section focuses on
improvements relating to properties that were already
present in the original version of the C2DB while new
properties are discussed in the next section.
3.1. Stiffness tensor
The stiffness tensor, C, is a rank-4 tensor that relates
the stress of a material to the applied strain. In Mandel
notation (a variant of Voigt notation) C is expressed
as an N ×N matrix relating the N independent com-
ponents of the stress and strain tensors. For a 2D
material N = 3 and the tensor takes the form:
C =
C
xxxx
C
xxyy
2C
xxxy
C
xxyy
C
yyyy
2C
yyxy
2C
xxxy
2C
yyxy
2C
xyxy
, (1)
where the indices on the matrix elements refer to the
rank-4 tensor. The factors multiplying the tensor ele-
ments account for their multiplicities in the full rank-
4 tensor. In the C2DB workflow, C is calculated as a
finite difference of the stress under an applied strain
with full relaxation of atomic coordinates. A negat-
ive eigenvalue of C signals a dynamical instability, see
section 2.7.
In the first version of the C2DB only the diagonal
elements of the stiffness tensor were calculated. The
new version also determines the shear components
such that the full 3 ×3 stiffness tensor is now avail-
able. This improvement also leads to a more accurate
assessment of dynamical stability [57].
3.2. Effective masses with parabolicity estimates
For all materials with a finite band gap the effective
masses of electrons and holes are calculated for bands
within 100 meV of the conduction band minimum
and valence band maximum, respectively. The Hes-
sian matrices at the band extrema (BE) are determ-
ined by fitting a second order polynomium to the
PBE band structure including SOC, and the effective
masses are obtained by subsequent diagonalisation of
the Hessian. The main fitting-procedure is unaltered
6
2D Mater. 8 (2021) 044002 M N Gjerding et al
Figure 5. Left: The PBE band structures of Rh
2
Br
6
and MoS
2
(coloured dots) in regions around the conduction band minimum.
The dashed red line shows the fit made to estimate the effective masses of the lowest conduction band. The shaded grey region
highlights the error between the fit and the true band structure. The mean absolute relative error (MARE) discussed in the main
text is calculated for energies within 25 meV of the band minimum. For MoS
2
the fit is essentially on top of the band energies.
Right: The distribution of the MARE of all effective mass fits in the C2DB. The inset shows the full distribution on a log scale. As
mentioned in the main text, very large MAREs indicate that the band minimum/maximum was incorrectly identified by the
algorithm and/or that the band is very flat. Only three materials have MAREs
>
1000% but these each have several bands for
which the fit fails.
from the first version of C2DB, but two important
improvements have been made.
The first improvement consists in an additional k-
mesh refinement step for better localisation of the BE
in the Brillouin zone. After the location of the BE has
been estimated based on a uniformly sampled band
structure with k-point density of 12 Å, another one-
shot calculation is performed with a denser k-mesh
around the estimated BE positions. This ensures a
more accurate and robust determination of the loc-
ation of the BE, which can be important in cases
with a small but still significant spin–orbit splitting or
when the band is very flat or non-quadratic around
the BE. The second refinement step is the same as
in the first version of C2DB, i.e. the band energies
are calculated on a highly dense k-mesh in a small
disc around the BE, and the Hessian is obtained by
fitting the band energies in the range up to 1 meV
from the BE.
The second improvement is the calculation of the
mean absolute relative error (MARE) of the polyno-
mial fit in a 25 meV range from the BE. The value of
25 meV corresponds to the thermal energy at room
temperature and is thus the relevant energy scale for
many applications. To make the MARE independent
of the absolute position of the band we calculate the
average energy of the band over the 25 meV and com-
pare the deviation of the fit to this energy scale. The
MARE provides a useful measure of the parabolicity
of the energy bands and thus the validity of the effect-
ive mass approximation over this energy scale.
Figure 5 shows two examples of band struc-
tures with the effective mass fits and corresponding
fit errors indicated. Additionally, the distribution of
MARE for all the effective mass fits in the C2DB
are presented. Most materials have an insignificant
MARE, but a few materials have very large errors.
Materials with a MARE above a few hundreds of per-
centages fall into two classes. For some materials the
algorithm does not correctly find the position of the
BE. An example is Ti
2
S
2
in the space group C2/m. For
others, the fit and BE location are both correct, but
the band flattens away from the BE which leads to a
large MARE as is the case for Rh
2
Br
6
shown in the
figure or Cl
2
Tl
2
in the space group P-1. In general a
small MARE indicates a parabolic band while materi-
als with large MARE should be handled on a case-by-
case basis.
3.3. Orbital projected band structure
To facilitate a state-specific analysis of the PBE Kohn–
Sham wave functions, an orbital projected band
structure (PBS) is provided to complement the pro-
jected density of states (PDOS). In the PAW meth-
odology, the all-electron wave functions are projec-
ted onto atomic orbitals inside the augmentation
spheres centred at the position of each atom. The
PBS resolves these atomic orbital contributions to the
7
2D Mater. 8 (2021) 044002 M N Gjerding et al
Figure 6. Orbital projected band structure and orbital projected density of states of MoS
2
in the H-phase. The pie chart symbols
indicate the fractional atomic orbital character of the Kohn–Sham wave functions.
wave functions as a function of band and k-point
whereas the PDOS resolves the atomic orbital char-
acter of the total density of states as a function of
energy. The SOC is not included in the PBS or PDOS,
as its effect is separately visualised by the spin-PBS
also available in the C2DB.
As an example, figure 6 shows the PBS (left) and
PDOS (right) of monolayer MoS
2
calculated with
PBE. The relative orbital contribution to a given Bloch
state is indicated by a pie chart symbol. In the present
example, one can deduce from the PBS that even
though Mo-p orbitals and S-p orbitals contribute
roughly equally to the DOS in the valence band, the
Mo-p orbital contributions are localised to a region in
the BZ around the M-point, whereas the S-p orbitals
contribute throughout the entire BZ.
3.4. Corrected G
0
W
0
band structures
The C2DB contains G
0
W
0
quasiparticle (QP) band
structures of 370 monolayers covering 14 different
crystal structures and 52 chemical elements. The
details of these calculations can be found in the ori-
ginal C2DB paper [6]. A recent in-depth analysis of
the 61.716 G
0
W
0
data points making up the QP band
structures led to several important conclusions relev-
ant for high-throughput G
0
W
0
calculations. In par-
ticular, it identified the linear QP approximation as
a significant error source in standard G
0
W
0
calcu-
lations and proposed an extremely simple correc-
tion scheme (the empirical Z (empZ) scheme), that
reduces this error by a factor of two on average.
The empZ scheme divides the electronic states
into two classes according to the size of the QP
weight, Z. States with Z [0.5, 1.0] are classified as
QP consistent (QP-c) while states with Z ̸∈[0.5, 1.0]
are classified as QP inconsistent (QP-ic). With this
definition, QP-c states will have at least half of their
spectral weight in the QP peak. The distribution of
the 60.000+ Z-values is shown in figure 7. It turns
out that the linear approximation to the self-energy,
which is the gist of the QP approximation, introduces
significantly larger errors for QP-ic states than for
QP-c states. Consequently, the empZ method replaces
the calculated Z of QP-ic states with the mean of the
Z-distribution, Z
0
0.75. This simple replacement
reduces the average error of the linear approximation
from 0.11 to 0.06 eV.
An illustration of the method applied to MoS
2
is
shown in figure 7. The original uncorrected G
0
W
0
band structure is shown in blue while the empZ cor-
rected band structure is shown in orange. MoS
2
has
only one QP-ic state in the third conduction band at
the K-point. Due to a break-down of the QP approx-
imation for this state, the G
0
W
0
correction is greatly
overestimated leading to a local discontinuity in the
band structure. The replacement of Z by Z
0
for this
particular state resolves the problem. All G
0
W
0
band
structures in the C2DB are now empZ corrected.
3.5. Optical absorbance
In the first version of the C2DB, the optical absorb-
ance was obtained from the simple expression [6]
A(ω)
ωImα
2D
(ω)
ϵ
0
c
, (2)
where α
2D
is the long wavelength limit of the in-plane
sheet polarisability density (note that the equation
is written here in SI units). The sheet polarisabil-
ity is related to the sheet conductivity via σ
2D
(ω) =
iωα
2D
(ω). The expression (2) assumes that the elec-
tric field inside the layer equals the incoming field (i.e.
reflection is ignored), and hence, it may overestimate
the absorbance.
In the new version, the absorbance is evaluated
from A = 1 R T, where R and T are the reflected
and transmitted powers of a plane wave at normal
8
2D Mater. 8 (2021) 044002 M N Gjerding et al
Figure 7. Top: Distribution of the 61 716 QP weights (Z)
contained in the C2DB. The blue part of the distribution
shows QP-consistent (QP-c) Z-values while the orange part
shows QP-inconsistent (QP-ic) Z values. In general, the
linear expansion of the self-energy performed when solving
the QP equation works better for Z closer to 1. About 0.3%
of the Z-values lie outside the interval from 0 to 1 and are
not included in the distribution. Bottom: G
0
W
0
band
structure before (blue) and after (orange) applying the
empZ correction, which replaces Z by the mean of the
distribution for QP-ic states. In the case of MoS
2
only one
state at K is QP-ic.
incidence, respectively. These can be obtained from
the conventional transfer matrix method applied to
a monolayer suspended in vacuum. The 2D mater-
ial is here modelled as an infinitely thin layer with a
sheet conductivity. Alternatively, it can be modelled
as quasi-2D material of thickness d with a ‘bulk’ con-
ductivity of σ = σ
2D
/d [58], but the two approaches
yield very similar results, since the optical thickness
of a 2D material is much smaller than the optical
wavelength. Within this model, the expression for the
absorbance of a suspended monolayer with the sheet
conductivity σ
2D
reads:
A(ω) = Re
σ
2D
(ω)η
0
2
2 + σ
2D
(ω)η
0
2
, (3)
where η
0
= 1/(ϵ
0
c) 377 is the vacuum imped-
ance.
If the light–matter interaction is weak, i.e.
|σ
2D
η
0
| 1, equation (3) reduces to equation (2).
Nonetheless, due the strong light–matter interaction
in some 2D materials, this approximation is not reli-
able in general. In fact, it can be shown that the max-
imum possible absorption from equation (3) is 50%,
which is known as the upper limit of light absorp-
tion in thin films [59]. This limit is not guaranteed
by equation (2), which can even yield an absorbance
above 100%.
As an example, figure 8 shows the absorption
spectrum of monolayer MoS
2
for in- and out-of-
plane polarised light as calculated with the exact
equation (3) and the approximate equation (2),
respectively. In all cases the sheet polarisability is
obtained from the BSE to account for excitonic effects
[6]. For weak light–matter interactions, e.g. for the z-
polarised light, the two approaches agree quite well,
but noticeable differences are observed in regions
with stronger light–matter interaction.
4. New materials in the C2DB
In this section we discuss the most significant exten-
sions of the C2DB in terms of new materials. The
set of materials presented here is not complete, but
represents the most important and/or well defined
classes. The materials discussed in sections 4.1 and
4.2 (MXY Janus monolayers and monolayers extrac-
ted from experimental crystal structure databases)
are already included in the C2DB. The materials
described in sections 4.3 and 4.4 (homo-bilayers and
monolayer point defect systems) will soon become
available as separate C2DB-interlinked databases.
4.1. MXY Janus monolayers
The class of TMDC monolayers of the type MX
2
(where M is the transition metal and X is a chalco-
gen) exhibits a large variety of interesting and unique
properties and has been widely discussed in the liter-
ature [60]. Recent experiments have shown that it is
not only possible to synthesise different materials by
changing the metal M or the chalcogen X, but also by
exchanging the X on one side of the layer by another
chalcogen (or halogen) [6163]. This results in a class
of 2D materials known as MXY Janus monolayers
with broken mirror symmetry and finite out-of-plane
dipole moments. The prototypical MXY crystal struc-
tures are shown in figure 9 for the case of MoSSe
and BiTeI, which have both been experimentally real-
ised [6163]. Adopting the nomenclature from the
TMDCs, the crystal structures are denoted as H- or
T-phase, depending on whether X and Y atoms are
vertically aligned or displaced, respectively.
In a recent work [64], the C2DB workflow was
employed to scrutinise and classify the basic elec-
tronic and optical properties of 224 different MXY
Janus monolayers. All data from the study is avail-
able in the C2DB. Here we provide a brief discussion
of the Rashba physics in these materials and refer the
9
2D Mater. 8 (2021) 044002 M N Gjerding et al
Figure 8. Optical absorption of standalone monolayer MoS
2
for x/y-polarisation (left) and z-polarisation (right) at normal
incident in the BSE framework, obtained using equation (2) (blue) or equation (3) (orange). The crystal structure cross-sectional
views are shown in the inset with the definition of directions.
Figure 9. Atomic structure of the MXY Janus monolayers in
the H-phase (left) and T-phase (right). The two prototype
materials MoSSe and BiTeI are examples of experimentally
realised monolayers adopting these crystal structures (not
to scale).
interested reader to [64] for more details and analysis
of other properties.
A key issue when considering hypothetical mater-
ials, i.e. materials not previously synthesised, is their
stability. The experimentally synthesised MoSSe and
BiTeI are both found to be dynamically stable and
lie within 10 meV of the convex hull confirming
their thermodynamic stability. Out of the 224 ini-
tial monolayers 93 are classified as stable according to
the C2DB criteria (dynamically stable and H
hull
<
0.2 eV atom
1
). Out of the 93 stable materials, 70
exhibit a finite band gap when computed with the
PBE xc-functional.
The Rashba effect is a momentum dependent
splitting of the band energies of a 2D semiconductor
in the vicinity of a band extremum arising due to
the combined effect of spin–orbit interactions and
a broken crystal symmetry in the direction perpen-
dicular to the 2D plane. The simplest model used to
describe the Rashba effect is a 2D electron gas in a per-
pendicular electric field (along the z-axis). Close to
the band extremum, the energy of the two spin bands
is described by the Rashba Hamiltonian [65, 66]:
H = α
R
(σ × k) ·
ˆ
e
z
, (4)
where σ is the vector of Pauli matrices, k = p/ is the
wave number, and the Rashba parameter is propor-
tional to the electric field strength, α
R
E
0
.
Although the Rashba Hamiltonian is only meant
as a qualitative model, it is of interest to test its valid-
ity on the Janus monolayers. The electric field of
the Rashba model is approximately given by E
0
=
V
vac
/d, where V
vac
is the shift in vacuum potential
on the two sides of the layer (see left inset of figure 10)
and d is the layer thickness. Assuming a similar thick-
ness for all monolayers, the electric field is propor-
tional to the potential shift. Not unexpected, the lat-
ter is found to correlate strongly with the difference in
electronegativity of the X and Y atoms, see left panel
of figure 10.
The Rashba energy, E
R
, can be found by fitting
E(k) =
2
k
2
/2m
±α
R
k to the band structure (see
right inset of figure 10) and should scale with the elec-
tric field strength. However, as seen from the right
panel of figure 10, there is no correlation between the
two quantities. Hence we conclude that the simple
Rashba model is completely inadequate and that the
strength of the perpendicular electric field cannot be
used to quantify the effect of spin–orbit interactions
on band energies.
4.2. Monolayers from known layered bulk crystals
The C2DB has been extended with a number of
monolayers that are likely exfoliable from experi-
mentally known layered bulk compounds. Specific-
ally, the Inorganic Crystal Structure Database (ICSD)
[67] and Crystallography Open Database (COD) [68]
10
2D Mater. 8 (2021) 044002 M N Gjerding et al
Figure 10. Left: Correlation between the electronegativity difference of X and Y in MXY Janus monolayers and the vacuum level
shift across the layer. Right: Correlation between the Rashba energy and the vacuum level shift. Structures in the H-phase (e.g.
MoSSe) are shown in black while structures in the T-phase (e.g. BiTeI) are shown in orange. The linear fit has the slope
1.35 eV/χ (Pauling scale). The insets show the definition of the vacuum level shift and the Rashba energy, respectively.
Modified from [64].
have first been filtered for corrupted, duplicate and
theoretical compounds, which reduce the initial set
of 585.485 database entries to 167.767 unique mater-
ials. All of these have subsequently been assigned a
dimensionality score based on a purely geometrical
descriptor. If the 2D score is larger than the sum of
0D, 1D and 3D scores we regard the material as being
exfoliable and we extract the individual 2D compon-
ents that comprise the material (see also section 2.2).
We refer to the original work on the method for
details [52] and note that similar approaches were
applied in [11, 12] to identify potentially exfoliable
monolayers from the ICSD and COD.
The search has been limited to bulk compounds
containing less than six different elements and no
rare earth elements. This reduces the set of relevant
bulk materials to 2991. For all of these we extracted
the 2D components containing less than 21 atoms
in the unit cell, which were then relaxed and sorted
for duplicates following the general C2DB workflow
steps described in sections 2.12.3. At this point 781
materials remain. This set includes most known 2D
materials and 207 of the 781 were already present
in the C2DB prior to this addition. All the materi-
als (including those that were already in C2DB) have
been assigned an ICSD/COD identifier that refers to
the parent bulk compound from which the 2D mater-
ial was computationally exfoliated. We emphasise that
we have not considered exfoliation energies in the
analysis and a subset of these materials may thus be
rather strongly bound and challenging to exfoliate
even if the geometries indicate van der Waals bonded
structures of the parent bulk compounds.
Figure 11 shows the distribution of energies
above the convex hull for materials derived from
parent structures in ICSD or COD as well as for the
entire C2DB, which includes materials obtained from
combinatorial lattice decoration as well. As expected,
the materials derived from experimental bulk materi-
als are situated rather close to the convex hull whereas
those obtained from lattice decoration extend to ener-
gies far above the convex hull. It is also observed that
a larger fraction of the experimentally derived mater-
ials are dynamically stable. There are, however, well
known examples of van der Waals bonded structures
where the monolayer undergoes a significant lattice
distortion, which will manifest itself as a dynamical
instability in the present context. For example, bulk
MoS
2
exists in van der Waals bonded structures com-
posed of either 2 H-MoS
2
or 1 T-MoS
2
layers, but a
monolayer of the 1 T phase undergoes a structural
deformation involving a doubling of the unit cell [69]
and is thus categorised as dynamically unstable by
the C2DB workflow. The dynamically stable materials
derived from parent bulk structures in the ICSD and
COD may serve as a useful subset of the C2DB that
are likely to be exfoliable from known compounds
and thus facilitate experimental verification. As a first
application the subset has been used to search for
magnetic 2D materials, which resulted in a total of 85
ferromagnets and 61 anti-ferromagnets [70].
4.3. Outlook: multilayers
The C2DB is concerned with the properties of cova-
lently bonded monolayers (see discussion of dimen-
sionality filtering in section 2.2). However, multilayer
structures composed of two or more identical mono-
layers are equally interesting and often have prop-
erties that deviate from those of the monolayer. In
fact, the synthesis of layered vdW structures with a
11
2D Mater. 8 (2021) 044002 M N Gjerding et al
Figure 11. Distribution of energies above the convex hull
for the 2D materials extracted from bulk compounds in
ICSD and COD (top) and for the entire C2DB including
those constructed from combinatorial lattice decoration
(bottom). Dynamically stable materials are indicated in
blue.
controllable number of layers represents an interest-
ing avenue for atomic-scale materials design. Several
examples of novel phenomena emerging in layered
vdW structures have been demonstrated including
direct-indirect band gap transitions in MoS
2
[71, 72],
layer-parity selective Berry curvatures in few-layer
WTe
2
[73], thickness-dependent magnetic order in
CrI
3
[74, 75], and emergent ferroelectricity in bilayer
hBN [76].
As a first step towards a systematic exploration of
multilayer 2D structures, the C2DB has been used as
basis for generating homobilayers in various stack-
ing configurations and subsequently computing their
properties following a modified version of the C2DB
monolayer workflow. Specifically, the most stable
monolayers (around 1000) are combined into bilay-
ers by applying all possible transformations (unit
cell preserving point group operations and transla-
tions) of one layer while keeping the other fixed. The
candidate bilayers generated in this way are subject
to a stability analysis, which evaluates the binding
energy and optimal IL distance based on PBE-D3 [77]
total energy calculations keeping the atoms of the
monolayers fixed in their PBE relaxed geometry, see
figures 12 and table 3.
Figure 12. An illustration of the optimisation of the
interlayer (IL) distance for MoS
2
in the AA stacking. The
black crosses are the points sampled by the optimisation
algorithm while the blue curve is a spline interpolation of
the black crosses. The inset shows the MoS
2
AA stacking
and the definition of the IL distance is indicated with a
black double-sided arrow.
Table 3. Exfoliation energies for selected materials calculated with
the PBE+D3 xc-functional as described in section 4.3 and
compared with the DF2 and rVV10 results from [11]. The
spacegroups are indicated in the column ‘SG’. All numbers are in
units of meV Å
2
.
Material SG PBE + D3 DF2 rVV10
MoS
2
P-6m2 28.9 21.6 28.8
MoTe
2
P-6m2 30.3 25.2 30.4
ZrNBr Pmmn 18.5 10.5 18.5
C P6/mmm 18.9 20.3 25.5
P Pmna 21.9 38.4 30.7
BN P-6m2 18.9 19.4 24.4
WTe
2
P-6m2 32.0 24.7 30.0
PbTe P3m1 23.2 27.5 33.0
The calculated IL binding energies are generally in
the range from a few to a hundred meV Å
2
and IL
distances range from 1.5 to 3.8 Å. A scatter plot of pre-
liminary binding energies and IL distances is shown
in figure 13. The analysis of homobilayers provides an
estimate of the energy required to peel a monolayer
off a bulk structure. In particular, the binding energy
for the most stable bilayer configuration provides a
measure of the exfoliation energy of the monolayer.
This key quantity is now available for all monolayers
in the C2DB, see section 5.1.
4.4. Outlook: point defects
The C2DB is concerned with the properties of 2D
materials in their pristine crystalline form. How-
ever, as is well known the perfect crystal is an ideal-
ised model of real materials, which always contain
defects in smaller or larger amounts depending on
the intrinsic materials properties and growth condi-
tions. Crystal defects often have a negative impact on
12
2D Mater. 8 (2021) 044002 M N Gjerding et al
Figure 13. Scatter plot of the calculated interlayer distance
and binding energies of (homo)bilayers of selected
materials from C2DB. A few well known materials are
highlighted: MoS
2
, graphene (C
2
), and hexagonal boron
nitride (hBN). The bilayer binding energies provide an
estimate of the monolayer exfoliation energies, see
section 5.1.
physical properties, e.g. they lead to scattering and life
time-reduction of charge carriers in semiconductors.
However, there are also important situations where
defects play a positive enabling role, e.g. in doping of
semiconductors, as colour centres for photon emis-
sion [78, 79] or as active sites in catalysis.
To reduce the gap between the pristine model
material and real experimentally accessible samples,
a systematic evaluation of the basic properties of the
simplest native point defects in a selected subset of
monolayers from the C2DB has been initiated. The
monolayers are selected based on the stability of the
pristine crystal. Moreover, only non-magnetic semi-
conductors with a PBE band gap satisfying E
gap
>
1 eV are currently considered as such materials are
candidates for quantum technology applications like
single-photon sources and spin qubits. Following
these selection criteria around 300 monolayers are
identified and their vacancies and intrinsic substitu-
tional defects are considered, yielding a total of about
1500 defect systems.
Each defect system is subject to the same work-
flow, which is briefly outlined below. To enable point
defects to relax into their lowest energy configuration,
the symmetry of the pristine host crystal is intention-
ally broken by the chosen supercell, see figure 14 (a).
In order to minimise defect–defect interaction, super-
cells are furthermore chosen such that the minimum
distance between periodic images of defects is larger
than 15 Å. Unique point defects are created based on
the analysis of equivalent Wyckoff positions for the
host material. To illustrate some of the properties that
will feature in the upcoming point defect database, we
consider the specific example of monolayer CH
2
Si.
First, the formation energy [80, 81] of a given
defect is calculated from PBE total energies. Next,
Slater–Janak transition state theory is used to obtain
the charge transition levels [82, 83]. By combining
these results, one obtains the formation energy of
the defect in all possible charge states as a function
of the Fermi level. An example of such a diagram is
shown in figure 14 (b) for the case of the V
C
and C
Si
defects in monolayer CH
2
Si. For each defect and each
charge state, the PBE single-particle energy level dia-
gram is calculated to provide a qualitative overview of
the electronic structure. A symmetry analysis [84] is
performed for the defect structure and the individual
defect states lying inside the band gap. The energy
level diagram of the neutral V
Si
defect in CH
2
Si is
shown in figure 14 (c), where the defect states are
labelled according to the irreducible representations
of the C
s
point group.
In general, excited electronic states can be mod-
elled by solving the Kohn–Sham equations with non-
Aufbau occupations. The excited-state solutions are
saddle points of the Kohn–Sham energy functional,
but common self-consistent field (SCF) approaches
often struggle to find such solutions, especially when
nearly degenerate states are involved. The calcula-
tion of excited states corresponding to transitions
between localised states inside the band gap is there-
fore performed using an alternative method based
on the direct optimisation (DO) of orbital rotations
in combination with the maximum overlap method
(MOM) [85]. This method ensures fast and robust
convergence of the excited states, as compared to SCF.
In figure 14 (d), the reorganisation energies for the
ground and excited state, as well as the zero-phonon
line (ZPL) energy are sketched. For the specific case of
the Si vacancy in CH
2
Si, the DO-MOM method yields
E
ZPL
= 3.84 eV, λ
reorg
gs
= 0.11 eV and λ
reorg
exc
= 0.16 eV.
For systems with large electron-phonon coupling (i.e.
Huang–Rhys factor > 1) a one-dimensional approx-
imation for displacements along the main phonon
mode is used to produce the configuration coordin-
ate diagram (see figure 14 (d)). In addition to the ZPL
energies and reorganisation energies, the Huang–
Rhys factors, photoluminescence spectrum from the
1D phonon model, hyperfine coupling and zero field
splitting are calculated.
5. New properties in the C2DB
This section reports on new properties that have
become available in the C2DB since the first release.
The employed computational methodology is
described in some detail and results are compared
to the literature where relevant. In addition, some
interesting property correlations are considered along
with general discussions of the general significance
and potential application of the available data.
5.1. Exfoliation energy
The exfoliation energy of a monolayer is estimated as
the binding energy of its bilayer in the most stable
13
2D Mater. 8 (2021) 044002 M N Gjerding et al
Figure 14. Overview of some of the properties included in the 2D defect database project for the example host material CH
2
Si.
(a) The supercell used to represent the defects (here a Si vacancy). The supercell is deliberately chosen to break the symmetry of
the host crystal lattice. (b) Formation energies of a C vacancy (green) and C–Si substitutional defect (purple). (c) Energy and
orbital symmetry of the localised single-particle states of the V
Si
defect for both spin channels (left and right). The Fermi level is
shown by the dotted line. (d) Schematic excited state configuration energy diagram. The transitions corresponding to the vertical
absorption and the zero-phonon emission are indicated.
stacking configuration (see also section 4.3). The
binding energy is calculated using the PBE + D3 xc-
functional [86] with the atoms of both monolayers
fixed in the PBE relaxed geometry. Table 3 compares
exfoliation energies obtained in this way to values
from Mounet et al [11] for a representative set of
monolayers.
5.2. Bader charges
For all monolayers we calculate the net charge on the
individual atoms using the Bader partitioning scheme
[87]. The analysis is based purely on the electron
density, which we calculate from the PAW pseudo
density plus compensation charges using the PBE xc-
functional. Details of the method and its implement-
ation can be found in Tang et al [88]. In section 5.4
we compare and discuss the relation between Bader
charges and Born charges.
5.3. Spontaneous polarisation
The spontaneous polarisation (P
s
) of a bulk mater-
ial is defined as the charge displacement with respect
to that of a reference centrosymmetric structure
[89, 90]. Ferroelectric materials exhibit a finite value
of P
s
that may be switched by an applied external field
and have attracted a large interest for a wide range of
applications [9193].
The spontaneous polarisation in bulk materials
can be regarded as electric dipole moment per unit
volume, but in contrast to the case of finite systems
this quantity is ill-defined for periodic crystals [89].
Nevertheless, one can define the formal polarisation
density:
P =
1
2
π
e
V
X
l
ϕ
l
a
l
, (5)
where a
l
(with l {1,2,3 }) are the lattice vectors
spanning the unit cell, V is the cell volume and e is the
elementary charge. ϕ
l
is the polarisation phase along
the lattice vector defined by:
ϕ
l
=
X
i
Z
i
b
l
·u
i
ϕ
elec
l
, (6)
where b
l
is the reciprocal lattice vector satisfying b
l
·
R
l
= 2π and u
i
is the position of nucleus i with charge
eZ
i
. The electronic contribution to the polarisation
phase is defined as:
14
2D Mater. 8 (2021) 044002 M N Gjerding et al
Figure 15. Depicted in the blue plot is the formal
polarisation calculated along the adiabatic path for GeSe,
using the methods described in the main text. The orange
plot shows the energy potential along the path as well as
outside. Figure inset: The structure of GeSe in the two
non-centrosymmetric configurations corresponding to
P
s
and P
s
and the centrosymmetric configuration.
ϕ
elec
l
=
1
N
kb
l
Im
X
kBZ
b
l
×ln
N
kb
l
1
Y
j=0
det
occ
u
nk+jδ k
u
mk+( j+1)δk
,
(7)
where BZ
b
l
= {k|k ·b
l
= 0} is a plane of k-points
orthogonal to b
l
, δk is the distance between neigh-
bouring k-points in the b
l
direction and N
kb
l
(N
kb
l
)
is the number of k-points along (perpendicular to)
the b
l
direction. These expression generalise straight-
forwardly to 2D.
The formal polarisation is only well-defined mod-
ulo eR
n
/V where R
n
is any lattice vector. However,
changes in polarisation are well defined and the spon-
taneous polarisation may thus be obtained by:
P
s
=
ˆ
1
0
dP(λ)
dλ
dλ, (8)
where λ is a dimensionless parameter that defines an
adiabatic structural path connecting the polar phase
(λ = 1) with a non-polar phase (λ = 0).
The methodology has been implemented in
GPAW and used to calculate the spontaneous polar-
isation of all stable materials in the C2DB with a PBE
band gap above 0.01 eV and a polar space group
symmetry. For each material, the centrosymmetric
phase with smallest atomic displacement from the
polar phase is constructed and relaxed under the con-
straint of inversion symmetry. The adiabatic path
connecting the two phases is then used to calculate the
spontaneous polarisation using equations (5)–(8). An
example of a calculation for GeSe is shown in figure 15
where the polarisation along the path connecting
two equivalent polar phases via the centrosymmet-
ric phase is shown together with the total energy. The
spontaneous polarisation obtained from the path is
39.8 nC m
1
in good agreement with previous calcu-
lations [94].
5.4. Born charges
The Born charge of an atom a at position u
a
in a solid
is defined as:
Z
a
ij
=
V
e
P
i
u
aj
E=0
. (9)
It can be understood as an effective charge assigned to
the atom to match the change in polarisation in dir-
ection i when its position is perturbed in direction j.
Since the polarisation density and the atomic position
are both vectors, the Born charge of an atom is a rank-
2 tensor. The Born charge is calculated as a finite dif-
ference and relies on the Modern theory of polarisa-
tion [95] for the calculation of polarisation densities,
see reference [96] for more details. The Born charge
has been calculated for all stable materials in C2DB
with a finite PBE band gap.
It is of interest to examine the relation between the
Born charge and the Bader charge (see section 5.2). In
materials with strong ionic bonds one would expect
the charges to follow the atoms. On the other hand,
in covalently bonded materials the hybridisation pat-
tern and thus the charge distribution, depends on the
atom positions in a complex way, and the idea of
charges following the atom is expected to break down.
In agreement with this idea, the (in-plane) Born
charges in the strongly ionic hexagonal hBN (±2.71e
for B and N, respectively) are in good agreement
with the calculated Bader charges (±3.0e). In con-
trast, (the in-plane) Born charges in MoS
2
(1.08e
and 0.54e for Mo and S, respectively) deviate signi-
ficantly from the Bader charges (1.22e and 0.61e for
Mo and S, respectively). In fact, the values disagree
even on the sign of the charges underlining the non-
intuitive nature of the Born charges in covalently bon-
ded materials.
Note that the out-of-plane Born charges never
match the Bader charges, even for strongly ionic insu-
lators, and are consistently smaller in value than the
in-plane components. The smaller out-of-plane val-
ues are consistent with the generally smaller out-of-
plane polarisability of 2D materials (for both elec-
tronic and phonon contributions) and agrees with the
intuitive expectation that it is more difficult to polar-
ise a 2D material in the out-of-plane direction as com-
pared to the in-plane direction.
Figure 16 shows the average of the diagonal of
the Born charge tensor, Tr(Z
a
)/3, plotted against the
Bader charges for all 585 materials in the C2DB for
which the Born charges have been computed. The
data points have been coloured according to the ion-
icity of the atom a defined as I(a) = |χ
a
χ⟩|, where
χ
a
and χ are the Pauling electronegativity of atom
a and the average electronegativity of all atoms in the
15
2D Mater. 8 (2021) 044002 M N Gjerding et al
Figure 16. Born charges, Tr(Z)/3, vs. Bader charges for
3025 atoms in the 585 materials for which the Born charges
are calculated. The colors indicate the ionicity of the atoms
(see main text).
Figure 17. Bader and in-plane Born charges vs. band gap.
unit cell, respectively. The ionicity is thus a measure
of the tendency of an atom to donate/accept charge
relative to the average tendency of atoms in the mater-
ial. It is clear from figure 16 that there is a larger
propensity for the Born and Bader charges to match
in materials with higher ionicity.
Figure 17 plots the average (in-plane) Born charge
and the Bader charge versus the band gap. It is clear
that large band gap materials typically exhibit integer
Bader charges, whereas there is no clear correlation
between the Born charge and the band gap.
5.5. Infrared polarisability
The original C2DB provided the frequency depend-
ent polarisability computed in the random phase
approximation (RPA) with inclusion of electronic
interband and intraband (for metals) transitions [6].
However, phonons carrying a dipole moment (so-
called IR active phonons) also contribute to the polar-
isability at frequencies comparable to the frequency of
optical phonons. This response is described by the IR
polarisability:
Figure 18. Total polarisability, including both electrons and
phonons, of monolayer hBN in the infrared (IR) frequency
regime. The resonance at around 180 meV is due to the
Γ-point longitudinal optical phonon. At energies above all
phonon frequencies (but below the band gap) the
polarisability is approximately constant and equal to the
static limit of the electronic polarisability,
α
.
α
IR
(ω) =
e
2
A
Z
T
M
1/2
(
i
d
i
d
T
i
ω
2
i
ω
2
iγω
)
M
1/2
Z,
(10)
where Z and M are matrix representations of the Born
charges and atomic masses, ω
2
i
and d
i
are eigenvectors
and eigenvalues of the dynamical matrix, A is the in-
plane cell area and γ is a broadening parameter rep-
resenting the phonon lifetime and is set to 10 meV.
The total polarisability is then the sum of the elec-
tronic polarisability and the IR polarisability.
The new C2DB includes the IR polarisability of
all monolayers for which the Born charges have been
calculated (stable materials with a finite band gap),
see section (5.4). As an example, figure 18 shows the
total polarisability of monolayer hexagonal hBN. For
details on the calculation of the IR polarisability see
reference [96].
5.6. Piezoelectric tensor
The piezoelectric effect is the accumulation of
charges, or equivalently the formation of an electric
polarisation, in a material in response to an applied
mechanical stress or strain. It is an important material
characteristic with numerous scientific and techno-
logical applications in sonar, microphones, acceler-
ometers, ultrasonic transducers, energy conversion,
etc [97, 98]. The change in polarisation originates
from the movement of positive and negative charge
centres as the material is deformed.
Piezoelectricity can be described by the (proper)
piezoelectric tensor c
ijk
with i, j,k {x, y,z}, given by
[99]:
c
ijk
=
e
2πV
X
l
ϕ
l
ϵ
jk
a
li
, (11)
which differs from equation (5) only by a derivative of
the polarisation phase with respect to the strain tensor
16
2D Mater. 8 (2021) 044002 M N Gjerding et al
Table 4. Comparison of computed piezoelectric tensor versus
experimental values and previous calculations for hexagonal BN
and a selected set of TMDCs (space group 187). All numbers are
in units of nC/m. Experimental data for MoS
2
is obtained from
[102].
Material Exp. Theory [101] C2DB
BN 0.14 0.13
MoS
2
0.3 0.36 0.35
MoSe
2
0.39 0.38
MoTe
2
0.54 0.48
WS
2
0.25 0.24
WSe
2
0.27 0.26
WTe
2
0.34 0.34
ϵ
jk
. Note that c
ijk
does not depend on the chosen
branch cut.
The piezoelectric tensor is a symmetric
tensor with at most 18 independent components.
Furthermore, the point group symmetry restricts the
number of independent tensor elements and their
relationships due to the well-known Neumanns prin-
ciple [100]. For example, monolayer MoS
2
with point
group D
3h
, has only one non-vanishing independ-
ent element of c
ijk
. Note that c
ijk
vanishes identic-
ally for centrosymmetric materials. Using a finite-
difference technique with a finite but small strain
(1% in our case), equation (11) has been used to
compute the proper piezoelectric tensor for all non-
centrosymmetric materials in the C2DB with a finite
band gap. Table 4 shows a comparison of the piezo-
electric tensors in the C2DB with literature for a
selected set of monolayer materials. Good agreement
is obtained for all these materials.
5.7. Topological invariants
For all materials in the C2DB exhibiting a direct band
gap below 1 eV, the k-space Berry phase spectrum
of the occupied bands has been calculated from the
PBE wave functions. Specifically, a particular k-point
is written as k
1
b
1
+ k
2
b
2
and the Berry phases γ
n
(k
2
)
of the occupied states on the path k
1
= 0 k
1
= 1 is
calculated for each value of k
2
. The connectivity of
the Berry phase spectrum determines the topological
properties of the 2D Bloch Hamiltonian [103, 104].
The calculated Berry phase spectra of the relev-
ant materials are available for visual inspection on the
C2DB webpage. Three different topological invari-
ants have been extracted from these spectra and are
reported in the C2DB: (1) The Chern number, C,
takes an integer value and is well defined for any
gapped 2D material. It determines the number of
chiral edge states on any edge of the material. For any
non-magnetic material the Chern number vanishes
due to time-reversal symmetry. It is determined from
the Berry phase spectrum as the number of crossings
at any horizontal line in the spectrum. (2) The mir-
ror Chern number, C
M
, defined for gapped mater-
ials with a mirror plane in the atomic layer [105].
For such materials, all states may be chosen as mirror
eigenstates with eigenvalues ±i and the Chern num-
bers C
±
can be defined for each mirror sector separ-
ately. For a material with vanishing Chern number,
the mirror Chern number is defined as C
M
= (C
+
C
)/2 and takes an integer value corresponding to
the number of edge states on any mirror symmetry
preserving edge. It is obtained from the Berry phase
spectrum as the number of chiral crossings in each
of the mirror sectors. (3) The Z
2
invariant, ν, which
can take the values 0 and 1, is defined for materi-
als with time-reversal symmetry. Materials with ν = 1
are referred to as quantum spin Hall insulators and
exhibit helical edge states at any time-reversal con-
serving edge. It is determined from the Berry phase
spectrum as the number of crossing points modulus
2 at any horizontal line in the interval k
2
[0, 1/2].
Figure 19 shows four representative Berry phase
spectra corresponding to the three cases of non-
vanishing C, C
M
and ν as well as a trivial insulator.
The four materials are: OsCl
3
(space group 147)—
a Chern insulator with C = 1, OsTe
2
(space group
14)—a mirror crystalline insulator with C
M
= 2, SbI
(spacegroup 1)—a quantum spin Hall insulator with
ν = 1 and BiITe (spacegroup 156)—a trivial insulator.
Note that a gap in the Berry phase spectrum always
implies a trivial insulator.
In [106] the C2DB was screened for materi-
als with non-trivial topology. At that point it was
found that the database contained 7 Chern insulat-
ors, 21 mirror crystalline topological insulators and
48 quantum spin Hall insulators. However, that does
not completely exhaust the the topological proper-
ties of materials in the C2DB. In particular, there
may be materials that can be topologically classified
based on crystalline symmetries other than the mirror
plane of the layer. In addition, second order topolo-
gical effects may be present in certain materials, which
imply that flakes will exhibit topologically protected
corner states. Again, the Berry phase spectra may be
used to unravel the second order topology by means
of nested Wilson loops [107].
5.8. Exchange coupling constants
The general C2DB workflow described in
sections 2.12.3 will identify the FM ground state
of a material and apply it as starting point for sub-
sequent property calculations, whenever it is more
stable than the spin-paired ground state. In reality,
however, the FM state is not guaranteed to comprise
the magnetic ground state. In fact, AFM states often
have lower energy than the FM one, but in general
it is non-trivial to obtain the true magnetic ground
state. We have chosen to focus on the FM state due
to its simplicity and because its atomic structure and
stability are often very similar to those of other mag-
netic states. Whether or not the FM state is the true
magnetic ground state is indicated by the nearest
neighbour exchange coupling constant as described
below.
17
2D Mater. 8 (2021) 044002 M N Gjerding et al
Figure 19. Berry phase spectra of the Chern insulator OsCl
3
(top left), the crystalline topological insulator OsTe
2
(top right), the
quantum spin Hall insulator SbI (lower left) and the trivial insulator BiITe (lower right).
When investigating magnetic materials the ther-
modynamical properties (for example the critical
temperatures for ordering) are of crucial interest. In
two dimensions the Mermin–Wagner theorem [108]
comprises an extreme example of the importance of
thermal effects since it implies that magnetic order
is only possible at T = 0 unless the spin-rotational
symmetry is explicitly broken. The thermodynamic
properties cannot be accessed directly by DFT. Con-
sequently, magnetic models that capture the crucial
features of magnetic interactions must be employed.
For insulators, the Heisenberg model has proven
highly successful in describing magnetic properties
of solids in 3D as well as 2D [109]. It represents the
magnetic degrees of freedom as a lattice of localised
spins that interact through a set of exchange coup-
ling constants. If the model is restricted to include
only nearest neighbour exchange and assume mag-
netic isotropy in the plane, it reads:
H =
J
2
X
ij
S
i
·S
j
λ
2
X
ij
S
z
i
S
z
j
A
X
i
S
z
i
2
, (12)
where J is the nearest neighbour exchange constant,
λ is the nearest neighbour anisotropic exchange con-
stant and A measures the strength of single-ion aniso-
tropy. We also neglect off-diagonal exchange coup-
ling constants that give rise to terms proportional to
S
x
i
S
y
j
, S
y
i
S
z
j
and S
z
i
S
x
j
. The out-of-plane direction has
been chosen as z and ij implies that for each site i
we sum over all nearest neighbour sites j. The para-
meters J, λ and A may be obtained from an energy
mapping analysis involving four DFT calculations
with different spin configurations [70, 110, 111]. The
thermodynamic properties of the resulting ‘first prin-
ciples Heisenberg model’ may subsequently be ana-
lysed with classical Monte Carlo simulations or renor-
malised spin wave theory [36, 112].
The C2DB provides the values of J, λ, and A as
well as the number of nearest neighbours N
nn
and
the maximum eigenvalue of S
z
(S), which is obtained
from the total magnetic moment per atom in the
FM ground state (rounded to nearest half-integer for
metals). These key parameters facilitate easy post-
processing analysis of thermal effects on the magnetic
structure. In [113] such an analysis was applied to
estimate the critical temperature of all FM materials
in the C2DB based on a model expression for T
C
and
the parameters from equation (12).
For metals, the Heisenberg parameters available
in C2DB should be used with care because the Heis-
enberg model is not expected to provide an accur-
ate description of magnetic interactions in this case.
Nevertheless, even for metals the sign and magnitude
of the parameters provide an important qualitative
measure of the magnetic interactions that may be
used to screen and select materials for more detailed
investigations of magnetic properties.
18
2D Mater. 8 (2021) 044002 M N Gjerding et al
A negative value of J implies the existence of an
AFM state with lower energy than the FM state used
in C2DB. This parameter is thus crucial to consider
when judging the stability and relevance of a mater-
ial classified as magnetic in C2DB (see section 2.5).
Figure 20 shows the distribution of exchange coupling
constants (weighted by S
2
) of the magnetic materials
in the C2DB. The distribution is slightly skewed to the
positive side indicating that FM order is more com-
mon than AFM order.
The origin of magnetic anisotropy may stem from
either single-ion anisotropy or anisotropic exchange
and it is in general difficult a priori to determ-
ine, which mechanism is most important. There
is, however, a tendency in the literature to neglect
anisotropic exchange terms in a Heisenberg model
description of magnetism and focus solely on the
single-ion anisotropy. In figure 20 we show a scat-
ter plot of the anisotropy parameters A and λ for the
FM materials (J > 0). The spread of the parameters
indicate that the magnetic anisotropy is in general
equally likely to originate from both mechanisms
and neglecting anisotropic exchange is not advis-
able. For ferromagnets, the model (equation (12))
only exhibits magnetic order at finite temperatures if
A(2S 1) + λN
nn
> 0 [113]. Neglecting anisotropic
exchange thus excludes materials with A < 0 that sat-
isfies A(2S 1) + λN
nn
> 0. This is in fact the case for
11 FM insulators and 31 FM metals in the C2DB.
5.9. Raman spectrum
Raman spectroscopy is an important technique used
to probe the vibrational modes of a solid (or
molecule) by means of inelastic scattering of light
[114]. In fact, Raman spectroscopy is the domin-
ant method for characterising 2D materials and can
yield detailed information about chemical composi-
tion, crystal structure and layer thickness. There exist
several different types of Raman spectroscopies that
differ mainly by the number of photons and phon-
ons involved in the scattering process [114]. The first-
order Raman process, in which only a single phonon
is involved, is the dominant scattering process in
samples with low defect concentrations.
In a recent work, the first-order Raman spec-
tra of 733 monolayer materials from the C2DB were
calculated, and used as the basis for an automatic
procedure for identifying a 2D material entirely from
its experimental Raman spectrum [115]. The Raman
spectrum is calculated using third-order perturba-
tion theory to obtain the rate of scattering processes
involving creation/annihilation of one phonon and
two photons, see reference [115] for details. The
light field is written as F (t) = F
in
u
in
exp(iω
in
t) +
F
out
u
out
exp(iω
out
t)+c.c. where F
in/out
and ω
in/out
denote the amplitudes and frequencies of the
input/output electromagnetic fields, respectively. In
addition, u
in/out
=
P
i
u
i
in/out
e
i
are the correspond-
ing polarisation vectors, where e
i
denotes the unit
Figure 20. Top: Distribution of exchange coupling
constants in C2DB. Bottom: Single-ion anisotropy A vs
anisotropic exchange λ for ferromagnetic materials with
S > 1/2. The shaded area indicates the part of parameter
space where the model (equation (12)) does not yield an
ordered state at finite temperatures.
vector along the i-direction with i {x,y,z}. Using
this light field, the final expression for the Stokes
Raman intensity involving scattering events by only
one phonon reads [115]:
I(ω) = I
0
X
ν
n
ν
+ 1
ω
ν
X
ij
u
i
in
R
ν
ij
u
j
out
2
δ(ω ω
ν
).
(13)
Here, I
0
is an unimportant constant (since Raman
spectra are always reported normalised), and n
ν
is obtained from the Bose–Einstein distribution,
i.e. n
ν
(exp[ω
ν
/k
B
T] 1)
1
at temperature T
for a Raman mode with energy ω
ν
. Note that
only phonons at the Brillouin zone center (with
zero momentum) contribute to the one-phonon
Raman processes due to momentum conservation. In
equation (13), R
ν
ij
is the Raman tensor for phonon
mode ν, which involves electron–phonon and dipole
matrix elements as well as the electronic trans-
ition energies and the incident excitation frequency.
Equation (13) has been used to compute the Raman
spectra of the 733 most stable, non-magnetic mono-
layers in C2DB for a range of excitation frequen-
cies and polarisation configurations. Note that the
Raman shift ω is typically expressed in cm
1
with
19
2D Mater. 8 (2021) 044002 M N Gjerding et al
Figure 21. Comparison of the calculated and experimental (extracted from [62]) Raman spectrum of MoS
2
(left) and MoSSe
(right). The excitation wavelength is 532 nm, and both the polarisation of both the incoming and outgoing photons are along the
y-direction. The Raman peaks are labelled according to the irreducible representations of the corresponding vibrational modes.
Adapted from [115].
1 meV equivalent to 8.0655 cm
1
. In addition, for
generating the Raman spectra, we have used a Gaus-
sian [G(ω) = (σ
2π)
1
exp(ω
2
/2σ
2
)] with a vari-
ance σ = 3 cm
1
to replace the Dirac delta function,
which accounts for the inhomogeneous broadening
of phonon modes.
As an example, figure 21 shows the calcu-
lated Raman spectrum of monolayer MoS
2
and the
Janus monolayer MoSSe (see section 4.1). Experi-
mental Raman spectra extracted from reference [62]
are shown for comparison. For both materials,
good agreement between theory and experiment is
observed for the peak positions and relative amp-
litudes of the main peaks. The small deviations can
presumably be attributed to substrate interactions
and defects in the experimental samples as well as
the neglect of excitonic effects in the calculations.
The qualitative differences between the Raman spec-
tra can be explained by the different point groups
of the materials (C
3v
and D
3h
, respectively), see ref-
erence [115]. In particular, the lower symmetry of
MoSSe results in a lower degeneracy of its vibrational
modes leading to more peaks in the Raman spectrum.
Very recently, the Raman spectra computed
from third order perturbation theory as described
above, were supplemented by spectra obtained from
the more conventional Kramers–Heisenberg–Dirac
(KHD) approach. Within the KHD method, the
Raman tensor is obtained as the derivative of the static
electric polarisability (or equivalently, the susceptib-
ility) along the vibrational normal modes [116, 117]:
R
ν
ij
=
X
αl
χ
(1)
ij
r
αl
v
ν
αl
M
α
. (14)
Here, χ
(1)
ij
is the (first-order) susceptibility tensor, r
α
and M
α
are the position and atomic mass of atom
α, respectively, and v
ν
αl
is the eigenmode of phonon
ν. The two approaches, i.e. the KHD and third-order
perturbation approach, can be shown to be equi-
valent [114], at least when local field effects can be
ignored as is typically the case for 2D materials [35].
We have also confirmed this equivalence from our
calculations. Furthermore, the computational cost of
both methods is also similar [115]. However, the
KHD approach typically converge faster with respect
to both the number of bands and k-grid compared
to the third-order perturbation method. This stems
from the general fact that higher-order perturba-
tion calculations converge slower with respect to k-
grid and they require additional summations over a
complete basis set (virtual states) and hence a lar-
ger number of bands [118]. Currently, Raman spec-
tra from both approaches can be found at the C2DB
website.
5.10. Second harmonics generation
Nonlinear optical (NLO) phenomena such as har-
monic generation, Kerr, and Pockels effects are
of great technological importance for lasers, fre-
quency converters, modulators, etc. In addition,
NLO spectroscopy has been extensively employed
to obtain insight into materials properties [119]
that are not accessible by e.g. linear optical spec-
troscopy. Among numerous nonlinear processes,
second-harmonic generation (SHG) has been widely
used for generating new frequencies in lasers as well
as identifying crystal orientations and symmetries.
Recently, the SHG spectrum was calculated for
375 non-magnetic, non-centrosymmetric semicon-
ducting monolayers of the C2DB, and multiple 2D
materials with giant optical nonlinearities were iden-
tified [120]. In the SHG process, two incident photons
at frequency ω generate an emitted photon at fre-
quency of 2ω. Assume that a mono-harmonic electric
20
2D Mater. 8 (2021) 044002 M N Gjerding et al
Figure 22. (Left panel) SHG spectra of monolayer Ge
2
Se
2
, where only non-vanishing independent tensor elements are shown.
The vertical dashed lines mark ω = E
g
/2 and ω = E
g
, respectively. The crystal structure of Ge
2
Se
2
structure is shown in the
inset. (Right panel) The rotational anisotropy of the static (ω = 0) SHG signal for parallel (blue) and perpendicular (red)
polarisation configurations with θ defined with respect to the crystal x-axis.
field written F (t) =
P
i
F
i
e
i
e
iω t
+c.c. is incident on
the material, where e
i
denotes the unit vector along
direction i {x, y, z}. The electric field induces a SHG
polarisation density P
(2)
, which can be obtained from
the quadratic susceptibility tensor χ
(2)
ijk
,
P
(2)
i
(t) = ϵ
0
X
jk
χ
(2)
ijk
(ω,ω)F
i
F
j
e
2iω t
+ c.c., (15)
where ε
0
denotes the vacuum permittivity. χ
(2)
ijk
is a
symmetric (due to intrinsic permutation symmetry
i.e. χ
(2)
ijk
= χ
(2)
ijk
) rank-3 tensor with at most 18 inde-
pendent elements. Furthermore, similar to the piezo-
electric tensor, the point group symmetry reduces the
number of independent tensor elements.
In the C2DB, the quadratic susceptibility is calcu-
lated using density matrices and perturbation theory
[118, 121] with the involved transition dipole mat-
rix elements and band energies obtained from DFT.
The use of DFT single-particle orbitals implies that
excitonic effects are not accounted for. The number
of empty bands included in the sum over bands was
set to three times the number of occupied bands.
The width of the Fermi–Dirac occupation factor was
set to k
B
T = 50 meV, and a line-shape broadening
of η = 50 meV was used in all spectra. Furthermore,
time-reversal symmetry was imposed in order to
reduce the k-integrals to half the BZ. For various 2D
crystal classes, it was verified by explicit calculation
that the quadratic tensor elements fulfil the expec-
ted symmetries, e.g. that they all vanish identically for
centrosymmetric crystals.
As an example, the calculated SHG spectra for
monolayer Ge
2
Se
2
is shown in figure 22 (left panel).
Monolayer Ge
2
Se
2
has five independent tensor ele-
ments, χ
(2)
xxx
, χ
(2)
xyy
, χ
(2)
xzz
, χ
(2)
yyx
= χ
(2)
yxy
, and χ
(2)
zzx
=
χ
(2)
zxz
, since it is a group-IV dichalcogenide with an
orthorhombic crystal structure (space group 31 and
point group C
2v
). Note that, similar to the linear
susceptibility, the bulk quadratic susceptibility (with
SI units of m V
1
) is ill-defined for 2D materi-
als (since the volume is ambiguous) [120]. Instead,
the unambiguous sheet quadratic susceptibility (with
SI units of m
2
V
1
) is evaluated. In addition to
the frequency-dependent SHG spectrum, the angu-
lar dependence of the static (ω = 0) SHG intens-
ity at normal incidence for parallel and perpendic-
ular polarisations (relative to the incident electric
field) is calculated, see figure 22 (right panel). Such
angular resolved SHG spectroscopy has been widely
used for determining the crystal orientation of 2D
materials. The calculated SHG spectra for all non-
vanishing inequivalent polarisation configurations
and their angular dependence, are available in the
C2DB.
Since C2DB has already gathered various mater-
ial properties of numerous 2D materials, it provides
a unique opportunity to investigate interrelations
between different material properties. For example,
the strong dependence of the quadratic optical
response on the electronic band gap was demon-
strated on basis of the C2DB data [120]. As another
example of a useful correlation, the static quadratic
susceptibility is plotted versus the static linear sus-
ceptibility for 67 TMDCs (with formula MX
2
, space
group 187) in figure 23. Note that for materials with
several independent tensor elements, only the largest
is shown. There is a very clear correlation between
the two quantities. This is not unexpected as both
21
2D Mater. 8 (2021) 044002 M N Gjerding et al
Figure 23. Scatter plot (double log scale) of the static sheet
quadratic susceptibility |χ
(2)
ijk
| versus the static sheet linear
susceptibility |χ
(1)
ij
| for 67 TMDCs (with chemical formula
MX
2
and space group 187). A few well known materials are
highlighted.
the linear and quadratic optical responses are func-
tions of the transition dipole moments and transition
energies. More interestingly, the strength of the quad-
ratic response seems to a very good approximation to
be given by a universal constant times the linear sus-
ceptibility to the powerof three (ignoring polarisation
indices), i.e.
χ
(2)
(0,0) Aχ
(1)
(0)
3
, (16)
where A is only weakly material dependent. Note that
this scaling law is also known in classical optics as
semi-empirical Miller’s rule for non-resonant quad-
ratic responses [122], which states that the second
order electric susceptibility is proportional to the
product of the first-order susceptibilities at the three
frequencies involved.
6. Machine learning properties
In recent years, material scientists have shown great
interest in exploiting the use of machine learning
(ML) techniques for predicting materials properties
and guiding the search for new materials. ML is the
scientific study of algorithms and statistical models
that computer systems can use to perform a specific
task without using explicit instructions but instead
relying on patterns and inference. Within the domain
of materials science, one of the most frequent prob-
lems is the mapping from atomic configuration to
material property, which can be used e.g. to screen
large material spaces in search of optimal candidates
for specific applications [123, 124].
In the ML literature, the mathematical represent-
ation of the input observations is often referred to as
a fingerprint. Any fingerprint must satisfy a number
of general requirements [125]. In particular, a finger-
print must be:
(a) Complete: The fingerprint should incorporate all
the relevant input for the underlying problem,
i.e. materials with different properties should
have different fingerprints.
(b) Compact: The fingerprint should contain no or
a minimal number of features redundant to the
underlying problem. This includes being invari-
ant to rotations, translations and other trans-
formations that leave the properties of the system
invariant.
(c) Descriptive: Materials with similar target values
should have similar fingerprints.
(d) Simple: The fingerprint should be efficient to
evaluate. In the present context, this means that
calculating the fingerprint should be signific-
antly faster than calculating the target property.
Several types of atomic-level materials finger-
prints have been proposed in the literature, includ-
ing general purpose fingerprints based on atom-
istic properties [126, 127] possibly encoding inform-
ation about the atomic structure, i.e. atomic pos-
itions [125, 128, 129], and specialised fingerprints
tailored for specific applications (materials/proper-
ties) [130, 131].
The aim of this section is to demonstrate how
the C2DB may be utilised for ML-based prediction
of general materials properties. Moreover, the study
serves to illustrate the important role of the finger-
print for such problems. The 2D materials are repres-
ented using three different fingerprints: two popular
structural fingerprints and a more advanced finger-
print that encodes information about the electronic
structure via the PDOS. The target properties include
the HSE06 band gap, the PBE heat of formation
(H), the exciton binding energy (E
B
) obtained from
the many-body BSE, the in-plane static polarisability
calculated in the RPA averaged over the x and y polar-
isation directions (α
i
), and the in-plane Voigt mod-
ulus (C
ii
) defined as
1
4
(C
11
+ C
22
+ 2C
12
), where C
ij
is a component of the elastic stiffness tensor in Man-
del notation.
To introduce the data, figure 24 shows pair-plots
of the dual-property relations of these properties. The
plots in the diagonal show the single-property histo-
grams, whereas the off-diagonals show dual-property
scatter plots below the diagonal and histograms above
the diagonal. Clearly, there are only weak correla-
tions between most of the properties, with the largest
degree of correlation observed between the HSE06
gap and exciton binding energy. The lack of strong
correlations motivates the use of ML for predicting
the properties.
The prediction models are build using the Ewald
sum matrix and many-body tensor representation
(MBTR) as structural fingerprints. The Ewald finger-
print is a version of the simple Coulomb matrix fin-
gerprint [128] modified to periodic systems [125].
The MBTR encodes first, second and third order
22
2D Mater. 8 (2021) 044002 M N Gjerding et al
Figure 24. Pair-plot of selected properties from C2DB. The diagonal contains the single property histograms. Below the diagonal
are two-property scatter plots showing the correlation between properties and above the diagonal are two-property histograms.
properties include the HSE06 band gap, the PBE heat of formation (H), the exciton binding energy (E
B
) calculated from the
BSE, the in-plane static polarisability calculated in the RPA and averaged over the x and y polarisation directions (α
i
), and the
in-plane Voigt modulus (C
ii
) defined as
1
4
(C
11
+ C
22
+ 2C
12
), where C
ij
is a component of the elastic stiffness tensor.
terms like atomic numbers, distances and angles
between atoms in the system [129]. As an alternative
to the structural fingerprints, a representation based
on the PBE PDOS is also tested. This fingerprint
6
encodes the coupling between the PDOS at different
atomic orbitals in both energy and real space. It is
defined as:
ρ
νν
(E,R) =
X
acell
X
a
ρ
aν
(E)ρ
a
ν
(E)G
×(R |R
a
R
a
|), (17)
where G is a Gaussian smearing function, a denotes
the atoms, ν denotes atomic orbitals, and the PDOS
is given by:
ρ
aν
(E) =
X
n
|⟨ψ
n
|aν⟩|
2
G(E ϵ
n
), (18)
6
Details will be published elsewhere.
where n runs over all eigenstates of the system. Since
this fingerprint requires a DFT-PBE calculation to
be performed, additional features derivable from the
DFT calculation can be added to the fingerprint. In
this study, the PDOS fingerprint is amended by the
PBE band gap. The latter can in principle be extrac-
ted from the PDOS, but its explicit inclusion has been
found to improve the performance of the model.
A Gaussian process regression using a simple
Gaussian kernel with a noise component is used as
learning algorithm. The models are trained using 5-
fold cross validation on a training set consisting of
80% of the materials with the remaining 20% held
aside as test data. Prior to training the model, the
input space is reduced to 50 features using principal
component analysis (PCA). This step is necessary to
reduce the huge number of features in the MBTR
fingerprint to a manageable size. Although this is
not required for the Ewald and PDOS fingerprints,
23
2D Mater. 8 (2021) 044002 M N Gjerding et al
Figure 25. Prediction scores (MAE normalised to standard deviation of property values) for the test sets of selected properties
using a Gaussian process regression.
Figure 26. ML predicted HSE06 gap values vs. true values for Ewald, MBTR and PDOS fingerprints with MAE’s for train and test
set included. The PDOS is found to perform significantly better for the prediction of HSE06 gap.
we perform the same feature reduction in all cases.
The optimal number of features depends on the
choice of fingerprint, target property and learning
algorithm, but for consistency 50 PCA components
are used for all fingerprints and properties in this
study.
Figure 25 shows the prediction scores obtained
for the five properties using the three different fin-
gerprints. The employed prediction score is the mean
absolute error of the test set normalised by the
standard deviation of the property values (stand-
ard deviations are annotated in the diagonal plots
in figure 24). In general, the PDOS fingerprint out-
performs the structural fingerprints. The difference
between prediction scores is smallest for the static
polarisability α
i
and largest for the HSE06 gap. It
should be stressed that although the evaluation of
the PBE-PDOS fingerprint is significantly more time
consuming than the evaluation of the structural fin-
gerprints, it is still much faster than the evaluation of
all the target properties. Moreover, structural finger-
prints require the atomic structure, which in turns
requires a DFT structure optimisation (unless the
structure is available by other means).
The HSE06 band gap shows the largest sensitiv-
ity to the employed fingerprint. To elaborate on the
HSE06 results, figure 26 shows the band gap predicted
using each of the three different fingerprints plotted
against the true band gap. The mean absolute errors
on the test set is 0.95 and 0.74 eV for Ewald and
MBTR fingerprints, respectively, while the PDOS sig-
nificantly outperforms the other fingerprints with a
test MAE of only 0.21 eV. This improvement in pre-
diction accuracy is partly due to the presence of the
PBE gap in the PDOS fingerprint. However, our ana-
lysis shows that the pure PDOS fingerprint without
the PBE gap still outperforms the structural finger-
prints. Using only the PBE gap as feature results in a
test MAE of 0.28 eV.
The current results show that the precision of ML-
based predictions are highly dependent on the type
of target property and the chosen material repres-
entation. For some properties, the mapping between
atomic structure and property is easier to learn while
24
2D Mater. 8 (2021) 044002 M N Gjerding et al
others might require more/deeper information, e.g.
in terms of electronic structure fingerprints. Our res-
ults clearly demonstrate the potential of encoding
electronic structure information into the material fin-
gerprint, and we anticipate more work on this relev-
ant and exciting topic in the future.
7. Summary and outlook
We have documented a number of extensions and
improvements of the C2DB made in the period 2018–
2020. The new developments include: (1) A refined
and more stringent workflow for filtering prospect-
ive 2D materials and classifying them according to
their crystal structure, magnetic state and stability.
(2) Improvements of the methodology used to com-
pute certain challenging properties such as the full
stiffness tensor, effective masses, G
0
W
0
band struc-
tures, and optical absorption spectra. (3) New mater-
ials including 216 MXY Janus monolayers and 574
monolayers exfoliated from experimentally known
bulk crystals. In addition, ongoing efforts to system-
atically obtain and characterise bilayers in all possible
stacking configurations as well as point defects in the
semiconducting monolayers, have been described.
(4) New properties including exfoliation energies,
spontaneous polarisations, Bader charges, piezoelec-
tric tensors, IR polarisabilities, topological invariants,
magnetic exchange couplings, Raman spectra, and
SHG spectra. It should be stressed that the C2DB will
continue to grow as new structures and properties are
being added, and thus the present paper should not be
seen as a final report on the C2DB but rather a snap-
shot of its current state.
In addition to the above mentioned improve-
ments relating to data quantity and quality, the C2DB
has been endowed with a comprehensive document-
ation layer. In particular, all data presented on the
C2DB website are now accompanied by an inform-
ation field that explains the meaning and representa-
tion (if applicable) of the data and details how it was
calculated thus making the data easier to understand,
reproduce, and deploy.
The C2DB has been produced using the ASR
in combination with the GPAW electronic structure
code and the MyQueue task and workflow scheduling
system. The ASR is a newly developed Python-based
framework designed for high-throughput materi-
als computations. The highly flexible and modular
nature of the ASR and its strong coupling to the well
established community-driven ASE project, makes
it a versatile framework for both high- and low-
throughput materials simulation projects. The ASR
and the C2DB-ASR workflow are distributed as open
source code. A detailed documentation of the ASR
will be published elsewhere.
While the C2DB itself is solely concerned with
the properties of perfect monolayer crystals, ongo-
ing efforts focus on the systematic characterisation
of homo-bilayer structures as well as point defects in
monolayers. The data resulting from these and other
similar projects will be published as separate, inde-
pendent databases, but will be directly interlinked
with the C2DB making it possible to switch between
them in a completely seamless fashion. These devel-
opments will significantly broaden the scope and
usability of the C2DB+ (+ stands for associated data-
bases) that will help theoreticians and experimental-
ists to navigate one of the most vibrant and rapidly
expanding research fields at the crossroads of con-
densed matter physics, photonics, nanotechnology,
and chemistry.
Data availability statement
The data that support the findings of this study are
openly available at the following URL/DOI: https://
doi.org/10.11583/DTU.14616660.
Acknowledgments
The Center for Nanostructured Graphene (CNG) is
sponsored by the Danish National Research Found-
ation, Project DNRF103. This project has received
funding from the European Research Council (ERC)
under the European Unions Horizon 2020 research
and innovation program Grant Agreement No.
773122 (LIMA) and Grant Agreement No. 951786
(NOMAD CoE). T D acknowledges financial sup-
port from the German Research Foundation (DFG
Projects No. DE 2749/2-1).
ORCID iDs
Morten Niklas Gjerding https://orcid.org/0000-
0002-5256-660X
Alireza Taghizadeh https://orcid.org/0000-0003-
0876-9538
Asbjørn Rasmussen https://orcid.org/0000-0001-
7110-9255
Sajid Ali https://orcid.org/0000-0001-7865-2664
Fabian Bertoldo https://orcid.org/0000-0002-
1219-8689
Thorsten Deilmann https://orcid.org/0000-0003-
4165-2446
Nikolaj Rørbæk Knøsgaard
https://orcid.org/0000-0003-3709-5464
Mads Kruse https://orcid.org/0000-0002-0599-
5110
Ask Hjorth Larsen https://orcid.org/0000-0001-
5267-6852
Simone Manti https://orcid.org/0000-0003-3770-
0863
Thomas Garm Pedersen https://orcid.org/0000-
0002-9466-6190
Urko Petralanda https://orcid.org/0000-0003-
0226-0028
25
2D Mater. 8 (2021) 044002 M N Gjerding et al
Thorbjørn Skovhus https://orcid.org/0000-0001-
5215-6419
Mark Kamper Svendsen https://orcid.org/0000-
0001-9718-849X
Jens Jørgen Mortensen https://orcid.org/0000-
0001-5090-6706
Thomas Olsen https://orcid.org/0000-0001-6256-
9284
Kristian Sommer Thygesen
https://orcid.org/0000-0001-5197-214X
References
[1] Kohn W and Sham L J 1965 Phys. Rev. 140 A1133
[2] Schwierz F 2010 Nat. Nanotechnol. 5 487
[3] Novoselov K, Mishchenko A, Carvalho A and Castro
Neto A 2016 Science 353 6298
[4] Ferrari A C et al 2015 Nanoscale 7 4598–810
[5] Bhimanapati G R et al 2015 ACS Nano 9 11509–39
[6] Haastrup S et al 2018 2D Mater. 5 042002
[7] Shivayogimath A et al 2019 Nat. Commun. 10 1–7
[8] Zhou J et al 2018 Nature 556 355–9
[9] Anasori B, Lukatskaya M R and Gogotsi Y 2017 Nat. Rev.
Mater. 2 1–17
[10] Dou L et al 2015 Science 349 1518–21
[11] Mounet N et al 2018 Nat. Nanotechnol. 13 246–52
[12] Ashton M, Paul J, Sinnott S B and Hennig R G 2017 Phys.
Rev. Lett. 118 106101
[13] Geim A K and Grigorieva I V 2013 Nature 499 419–25
[14] Cao Y, Fatemi V, Fang S, Watanabe K, Taniguchi T,
Kaxiras E and Jarillo-Herrero P 2018 Nature 556 43–50
[15] Bistritzer R and MacDonald A H 2011 Proc. Natl Acad. Sci.
108 12233–7
[16] Zhao X et al 2020 Nature 581 171–7
[17] Wan J, Lacey S D, Dai J, Bao W, Fuhrer M S and Hu L 2016
Chem. Soc. Rev. 45 6742–65
[18] Wilkinson M D et al 2016 Sci. Data 3 1–9
[19] Wirtz L, Marini A and Rubio A 2006 Phys. Rev. Lett.
96 126104
[20] Cudazzo P, Tokatly I V and Rubio A 2011 Phys. Rev. B
84 085406
[21] Klots A et al 2014 Sci. Rep. 4 6608
[22] Chernikov A, Berkelbach T C, Hill H M, Rigosi A, Li Y,
Aslan O B, Reichman D R, Hybertsen M S and Heinz T F
2014 Phys. Rev. Lett. 113 076802
[23] Olsen T, Latini S, Rasmussen F and Thygesen K S 2016
Phys. Rev. Lett. 116 056401
[24] Riis-Jensen A C, Gjerding M N, Russo S and Thygesen K S
2020 Phys. Rev. B 102 201402
[25] Felipe H, Xian L, Rubio A and Louie S G 2020 Nat.
Commun. 11 1–10
[26] Sohier T, Gibertini M, Calandra M, Mauri F and Marzari N
2017 Nano Lett. 17 3758–63
[27] Ugeda M M et al 2014 Nat. Mater. 13 1091–5
[28] Winther K T and Thygesen K S 2017 2D Mater. 4 025059
[29] Wang Z, Rhodes D A, Watanabe K, Taniguchi T, Hone J C,
Shan J and Mak K F 2019 Nature 574 76–80
[30] Gong C et al 2017 Nature 546 265–9
[31] Huang B et al 2017 Nature 546 270–3
[32] Chang K et al 2016 Science 353 274–8
[33] Olsen T, Andersen E, Okugawa T, Torelli D, Deilmann T
and Thygesen K S 2019 Phys. Rev. Mater. 3 024005
[34] Marrazzo A, Gibertini M, Campi D, Mounet N and
Marzari N 2019 Nano Lett. 19 8431–40
[35] Thygesen K S 2017 2D Mater. 4 022004
[36] Torelli D and Olsen T 2018 2D Mater. 6 015028
[37] Rasmussen F A and Thygesen K S 2015 J. Phys. Chem. C
119 13169–83
[38] Enkovaara J et al 2010 J. Phys.: Condens. Matter. 22 253202
[39] Mortensen J J, Hansen L B and Jacobsen K W 2005 Phys.
Rev. B 71 035109
[40] Gjerding M, Skovhus T, Rasmussen A, Bertoldo F,
Larsen A H, Mortensen J J and Thygesen K S 2021 Atomic
simulation recipes—a python framework and library for
automated workflows (arXiv:2104.13431)
[41] Larsen A H et al 2017 J. Phys.: Condens. Matter. 29 273002
[42] Mortensen J J, Gjerding M and Thygesen K S 2020 J. Open
Source Softw. 5 1844
[43] Saal J E, Kirklin S, Aykol M, Meredig B and Wolverton C
2013 JOM 65 1501–9
[44] Jain A et al 2013 APL Mater. 1 011002
[45] Curtarolo S et al 2012 Comput. Mater. Sci. 58 218–26
[46] Ataca C, Sahin H and Ciraci S 2012 J. Phys. Chem. C
116 8983–99
[47] Lebègue S, Björkman T, Klintenberg M, Nieminen R M and
Eriksson O 2013 Phys. Rev. X 3 031002
[48] Korm
´
anyos A, Burkard G, Gmitra M, Fabian J, Zólyomi V,
Drummond N D and Fal’ko V 2015 2D Mater. 2 022001
[49] Zhou J et al 2019 Sci. Data 6 1–10
[50] Choudhary K, Kalish I, Beams R and Tavazza F 2017 Sci.
Rep. 7 1–16
[51] Larsen A H, Vanin M, Mortensen J J, Thygesen K S and
Jacobsen K W 2009 Phys. Rev. B 80 195112
[52] Larsen P M, Pandey M, Strange M and Jacobsen K W 2019
Phys. Rev. Mater. 3 034003
[53] Ong S P et al 2013 Comput. Mater. Sci. 68 314–19
[54] Patrick C E, Jacobsen K W and Thygesen K S 2015 Phys.
Rev. B 92 201205
[55] Kirklin S, Saal J E, Meredig B, Thompson A, Doak J W,
Aykol M, Rühl S and Wolverton C 2015 npj Computat.
Mater. 1 1–15
[56] Perdew J P, Burke K and Ernzerhof M 1996 Phys. Rev. Lett.
77 3865–8
[57] Ma
´
zdziarz M 2019 2D Mater. 6 048001
[58] Li Y and Heinz T F 2018 2D Mater. 5 025021
[59] Hadley L N and Dennison D 1947 J. Opt. Soc. Am.
37 451–65
[60] Wang G, Chernikov A, Glazov M M, Heinz T F, Marie X,
Amand T and Urbaszek B 2018 Rev. Mod. Phys.
90 021001
[61] Lu A Y et al 2017 Nat. Nanotechnol. 12 744–9
[62] Zhang J et al 2017 ACS Nano 11 8192–8
[63] Fülöp B et al 2018 2D Mater. 5 031013
[64] Riis-Jensen A C, Deilmann T, Olsen T and Thygesen K S
2019 ACS Nano 13 13354
[65] Bychkov Y A and Rashba E I 1984 J. Phys. C: Solid State
Phys. 17 6039
[66] Petersen L and Hedegård P 2000 Surf. Sci. 459 49–56
[67] Bergerhoff G, Brown I and Allen F et al 1987 Int. Union
Crystallogr., Chester 360 77–95
[68] Gražulis S et al 2012 Nucleic Acids Res. 40 D420–7
[69] Qian X, Liu J, Fu L and Li J 2014 Science 346 1344–7
[70] Torelli D, Moustafa H, Jacobsen K W and Olsen T 2020 npj
Comput. Mater. 6 158
[71] Mak K F, Lee C, Hone J, Shan J and Heinz T F 2010 Phys.
Rev. Lett. 105 136805
[72] Splendiani A, Sun L, Zhang Y, Li T, Kim J, Chim C Y,
Galli G and Wang F 2010 Nano Lett. 10 1271–5
[73] Xiao J et al 2020 Nat. Phys. 16 1028–34
[74] Sivadas N, Okamoto S, Xu X, Fennie C J and Xiao D 2018
Nano Lett. 18 7658–64
[75] Liu Y, Wu L, Tong X, Li J, Tao J, Zhu Y and Petrovic C 2019
Sci. Rep. 9 1–8
[76] Yasuda K, Wang X, Watanabe K, Taniguchi T and
Jarillo-Herrero P 2020 (arXiv:2010.06600)
[77] Grimme S, Antony J, Ehrlich S and Krieg H 2010 J. Chem.
Phys. 132 154104
[78] Northup T and Blatt R 2014 Nat. Photon. 8 356–63
[79] O’Brien J, Furusawa A and Vuckovic J 2009 Photonic
quantum technologies nat Photonics 3 687
[80] Zhang S and Northrup J E 1991 Phys. Rev. Lett. 67 2339
26
2D Mater. 8 (2021) 044002 M N Gjerding et al
[81] Van de Walle C G, Laks D, Neumark G and Pantelides S
1993 Phys. Rev. B 47 9425
[82] Janak J F 1978 Phys. Rev. B 18 7165
[83] Pandey M, Rasmussen F A, Kuhar K, Olsen T,
Jacobsen K W and Thygesen K S 2016 Nano Lett. 16 2234–9
[84] Kaappa S, Malola S and Häkkinen H 2018 J. Phys. Chem. A
122 8576–84
[85] Levi G, Ivanov A V and Jonsson H 2020 Faraday Discuss.
224 448-66
[86] Grimme S, Antony J, Ehrlich S and Krieg H 2010 J. Chem.
Phys. 132 154104
[87] Bader R F W 1990 Atoms in Molecules: A Quantum Theory
(The Int. Series of Monographs on Chemistry vol 22)
(Oxford: Clarendon)
[88] Tang W, Sanville E and Henkelman G 2009 J. Phys.:
Condens. Matter. 21 084204
[89] Resta R 1992 Ferroelectrics 136 51–5
[90] King-Smith R D and Vanderbilt D 1993 Phys. Rev. B 47 3
[91] Zhang S and Yu F 2011 J. Am. Ceram. Soc. 94 3153–70
[92] Maeder M D, Damjanovic D and Setter N 2004 J.
Electroceram. 13 385–92
[93] Scott J F 2000 Ferroelectric Memories vol 3 (Berlin:
Springer)
[94] Rangel T, Fregoso B M, Mendoza B S, Morimoto T,
Moore J E and Neaton J B 2017 Phys. Rev. Lett. 119 067402
[95] Resta R and Vanderbilt D 2007 Theory of Polarization: A
Modern Approach Phys. Ferroelectr. vol 105 (Berlin:
Springer) pp 31–68
[96] Gjerding M N, Cavalcante L S R, Chaves A and
Thygesen K S 2020 J. Phys. Chem. C 124 11609–16
[97] Ye Z G 2008 Handbook of Advanced Dielectric, Piezoelectric
and Ferroelectric Materials: Synthesis, Properties and
Applications (Amsterdam: Elsevier)
[98] Ogawa T 2016 Piezoelectric Materials (Croatia: InTech)
[99] Vanderbilt D 1999 J. Phys. Chem. Solids 61 147–51
[100] Authier A 2003 Int. Tables for Crystallography: Volume D:
Physical Properties of Crystals (Dordrecht: Springer)
[101] Duerloo K A N, Ong M T and Reed E J 2012 J. Phys. Chem.
Lett. 3 2871–6
[102] Zhu H et al 2015 Nat. Nanotechnol. 10 151–5
[103] Taherinejad M, Garrity K F and Vanderbilt D 2014 Phys.
Rev. B 89 115102
[104] Olsen T 2016 Phys. Rev. B 94 235106
[105] Fu L 2011 Phys. Rev. Lett. 106 106802
[106] Olsen T, Andersen E, Okugawa T, Torelli D, Deilmann T
and Thygesen K S 2019 Phys. Rev. Mater. 3 024005
[107] Benalcazar W A, Bernevig B A and Hughes T L 2017 Phys.
Rev. B 96 245115
[108] Mermin N D and Wagner H 1966 Phys. Rev. Lett. 17 1133–6
[109] Olsen T 2019 MRS Commun. 9 1142–50
[110] Olsen T 2017 Phys. Rev. B 96 125143
[111] Torelli D and Olsen T 2020 J. Phys.: Condens. Matter.
32 335802
[112] Lado J L and Fern
´
andez-Rossier J 2017 2D Mater.
4 035002
[113] Torelli D, Thygesen K S and Olsen T 2019 2D Mater.
6 045018
[114] Long D A 2002 The Raman Effect: A Unified Treatment of
the Theory of Raman Scattering by Molecules (Chichester:
Wiley)
[115] Taghizadeh A, Leffers U, Pedersen T G and Thygesen K S
2020 Nat. Commun. 11 3011
[116] Lee S Y and Heller E J 1979 J. Chem. Phys. 71 4777
[117] Umari P and Pasquarello A 2003 J. Phys. Condens. Matter
15 S1547–52
[118] Taghizadeh A, Hipolito F and Pedersen T G 2017 Phys. Rev.
B 96 195413
[119] Prylepa A et al 2018 J. Phys. D: Appl. Phys 51 043001
[120] Taghizadeh A, Thygesen K S and Pedersen T G 2021 ACS
Nano 15 7155
[121] Aversa C and Sipe J E 1995 Phys. Rev. B 52 14636–45
[122] Miller R C 1964 Appl. Phys. Lett. 5 17–19
[123] Schmidt J, Marques M R G, Botti S and Marques M A L
2019 Computat. Mater. 5 83
[124] Zhuo Y, Mansouri Tehrani A and Brgoch J 2018 J. Phys.
Chem. Lett. 9 1668–73
[125] Faber F, Lindmaa A, von Lilienfeld O A and Armiento R
2015 Int. J. Quantum Chem. 115 1094–101
[126] Ward L, Agrawal A, Choudhary A and Wolverton C 2016
Computat. Mater. 2 1–7
[127] Ghiringhelli L M, Vybiral J, Levchenko S V, Draxl C and
Scheffler M 2015 Phys. Rev. Lett. 114 105503
[128] Rupp M, Tkatchenko A, Müller K R and Von Lilienfeld A
2012 Phys. Rev. Lett. 108 058301
[129] Huo H and Rupp M 2018 Unified representation of
molecules and crystals for machine learning
(arXiv:1704.06439)
[130] Jorgensen P B, Mesta M, Shil S, García Lastra J M,
Jacobsen K W, Thygesen K S and Schmidt M N 2018 J.
Chem. Phys. 148 241735
[131] Rajan A C, Mishra A, Satsangi S, Vaish R, Mizuseki H,
Lee K R and Singh A K 2018 Chem. Mater. 30 4031–8
27